The muHVT package is a collection of R functions that facilitate building topology preserving maps for rich multivariate data analysis, with a focus on big data, that is, data with a large number of rows. The functions are organized around the following typical workflow:
Data Compression: Vector quantization (VQ) or hierarchical vector quantization (HVQ) using means or medians. This step compresses the rows (long data frame) using a compression objective.
Data Projection: Dimension projection of the compressed cells to 1D, 2D or 3D with Sammon’s non-linear algorithm. This step creates topology preserving map (also called an embedding) coordinates in the desired output dimension.
Tessellation: Create the cells required for object visualization using the Voronoi tessellation method. The package includes heatmap plots for hierarchical Voronoi tessellations (HVT). This step enables data insights, visualization, and interaction with the topology preserving map, which is useful for semi-supervised tasks.
Prediction: Scoring new data sets and recording their assignment using the map objects from the above steps, in a sequence of maps if required.
Compression is a technique used to reduce data size while preserving essential information, allowing for efficient storage and decompression to reconstruct the original data. Vector quantization (VQ) is a data compression technique that represents a set of data points with a smaller number of representative vectors. It achieves compression by exploiting redundancies or patterns in the data and replacing similar data points with representative vectors.
This package offers several advantages for data compression, as it is designed to handle high-dimensional data efficiently. It provides a hierarchical compression approach, allowing multi-resolution representation of the data. The hierarchical structure enables efficient compression and storage of the data while preserving different levels of detail. HVT aims to preserve the topological structure of the data during compression. Spatial data with irregular shapes and complex structures in high-dimensional data can contain valuable information about relationships and patterns. HVT seeks to capture and retain these topological characteristics, enabling meaningful analysis and visualization. This package employs tessellation to divide the compressed data space into distinct cells or regions while preserving the topology of the original data. This means that the relationships and connectivity between data points are maintained in the compressed representation.
This package can perform vector quantization using the following algorithms: k-means and k-medoids.
For k-means, the second and third steps are iterated until a predefined number of iterations is reached or the clusters converge. The runtime for the algorithm is O(n).
For k-medoids, the second and third steps are iterated until a predefined number of iterations is reached or the clusters converge. The runtime for the algorithm is O(k * (n-k)^2).
These algorithms divide the dataset recursively into cells using the k-means or k-medoids algorithm. The maximum number of subsets is set through n_cells; for example, setting n_cells to five divides the dataset into at most five subsets. Each of these five subsets is further divided into five subsets (or fewer), resulting in a total of up to twenty-five (5*5) subsets. The recursion terminates when a cell either contains fewer than three data points or a stop criterion is reached. In this case, the stop criterion is that the quantization error of the cell falls below the quantization threshold.
The steps for this method are as follows:
The stop criterion is when the quantization error of a cell satisfies one of the below conditions:
The quantization error for a cell is defined as follows:
\[QE = \max_i(||A-F_i||_{p})\]
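where A is the centroid of the cell, \(F_i\) is the \(i^{th}\) data point in the cell, m is the number of points in the cell, and p selects the norm used for the distance.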
Let us try to understand quantization error with an example.
Figure 1: The Voronoi tessellation for level 1 shown for the 5 cells with the points overlaid
An example of a 2 dimensional VQ is shown above.
In the above image, we can see 5 cells with each cell containing a certain number of points. The centroid for each cell is shown in blue. These centroids are also known as codewords since they represent all the points in that cell. The set of all codewords is called a codebook.
Now we want to calculate quantization error for each cell. For the
sake of simplicity, let’s consider only one cell having centroid
A and m data points \(F_i\) for calculating quantization
error.
For each point, we calculate the distance between the point and the centroid.
\[ d = ||A - F_i||_{p} \]
In the above equation, p = 1 means L1_Norm distance
whereas p = 2 means L2_Norm distance. In the package, the
L1_Norm distance is chosen by default. The user can pass
either L1_Norm, L2_Norm or a custom function
to calculate the distance between two points in n dimensions.
\[QE = \max_i(||A-F_i||_{p})\]
Now, we take the maximum calculated distance of all m points. This
gives us the furthest distance of a point in the cell from the centroid,
which we refer to as Quantization Error. If the
Quantization Error is higher than the given threshold, the centroid/
codevector is not a good representation for the points in the cell. Now
we can perform further Vector Quantization on these points and repeat
the above steps.
Please note that the user can select mean, max or any custom function
to calculate the Quantization Error. The custom function takes a vector
of m values (where each value is the distance between a point in
n dimensions and the centroid) and returns a single value,
which is the Quantization Error for the cell.
If we select mean as the error metric, the above
Quantization Error equation will look like this:
\[QE = \frac{1}{m}\sum_{i=1}^m||A-F_i||_{p}\]
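To make the two formulas concrete, here is a minimal base R sketch of the quantization error for a single cell. The helper names `p_norm_distance` and `quantization_error` and the toy points are illustrative assumptions; they are not functions of the package.

```r
# Distance between a centroid A and each row of a point matrix F_mat
# under the p-norm (p = 1 -> L1_Norm, p = 2 -> L2_Norm).
p_norm_distance <- function(A, F_mat, p = 1) {
  diffs <- sweep(F_mat, 2, A)          # subtract A from every row
  rowSums(abs(diffs)^p)^(1 / p)
}

# Quantization error of one cell: an error metric (max, mean, or a
# custom function) applied to the m point-to-centroid distances.
quantization_error <- function(A, F_mat, p = 1, error_metric = max) {
  error_metric(p_norm_distance(A, F_mat, p))
}

# Toy cell with m = 4 points in 2 dimensions (made-up values)
F_mat <- matrix(c(0, 0,
                  2, 0,
                  0, 2,
                  2, 2), ncol = 2, byrow = TRUE)
A <- colMeans(F_mat)                   # centroid (1, 1)

qe_max_l1  <- quantization_error(A, F_mat, p = 1, error_metric = max)
qe_mean_l2 <- quantization_error(A, F_mat, p = 2, error_metric = mean)
```

Every point of this toy cell lies at L1 distance 2 from the centroid, so the max-based error is 2, while the mean-based L2 error is the square root of 2. Passing any function that reduces a numeric vector to a single value as `error_metric` mirrors the custom error metric described above.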
Projection mainly involves converting data from its original form to a different space or coordinate system while preserving certain properties of it. By projecting data into a common coordinate system, spatial relationships, distances, areas, and other spatial attributes can be accurately measured and compared.
HVT performs projection as part of its workflow to visualize and explore high-dimensional data. The projection step in HVT involves mapping the compressed data, represented by the hierarchical structure of cells, onto a lower-dimensional space for visualization purposes, as human perception is more suited to interpreting information in lower-dimensional spaces. Users can zoom in/out, rotate, and explore different regions of the projected space to gain insights and understand the data from different perspectives.
Sammon’s projection is an algorithm used in this package to map a high-dimensional space to a space of lower dimensionality while attempting to preserve the structure of inter-point distances in the projection. It is particularly suited for use in exploratory data analysis and is usually considered a non-linear approach since the mapping cannot be represented as a linear combination of the original variables. The centroids are plotted in 2D after performing Sammon’s projection at every level of the tessellation.
Denote the distance between the \(i^{th}\) and \(j^{th}\) objects in the original space by \(d_{ij}^*\), and the distance between their projections by \(d_{ij}\). Sammon’s mapping aims to minimize the error function below, which is often referred to as Sammon’s stress or Sammon’s error.
\[E=\frac{1}{\sum_{i<j} d_{ij}^*}\sum_{i<j}\frac{(d_{ij}^*-d_{ij})^2}{d_{ij}^*}\]
The minimization of this error can be performed either by gradient descent, as proposed initially, or by other means, usually involving iterative methods. The number of iterations needs to be determined experimentally, and convergence is not always guaranteed. Many implementations prefer to use the first principal components as a starting configuration.
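The stress formula can be evaluated directly for any candidate projection. The sketch below uses only base R; the function name `sammon_stress` and the random data are assumptions for illustration. It scores a projection onto the first two principal components, the kind of starting configuration mentioned above.

```r
# Sammon's stress E for original data X and a projection Y (same row order).
sammon_stress <- function(X, Y) {
  d_star <- as.vector(dist(X))   # pairwise distances d*_ij, i < j
  d_proj <- as.vector(dist(Y))   # pairwise distances d_ij of the projections
  sum((d_star - d_proj)^2 / d_star) / sum(d_star)
}

set.seed(1)
X <- matrix(rnorm(60), ncol = 3)          # 20 made-up points in 3D
Y_pca <- prcomp(X)$x[, 1:2]               # first two principal components
stress_pca      <- sammon_stress(X, Y_pca)
stress_identity <- sammon_stress(X, X)    # a perfect "projection" has E = 0
```

An iterative optimizer such as `MASS::sammon` starts from a configuration like `Y_pca` and tries to lower this value; the `stress` element it returns can be compared with the output of the function above.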
A Voronoi diagram is a way of dividing space into a number of
regions. A set of points (called seeds, sites, or generators) is
specified beforehand and for each seed, there will be a corresponding
region consisting of all points closer to that seed than to any other. These
regions are called Voronoi cells. The Voronoi diagram is complementary to the
Delaunay triangulation, a geometrical algorithm used to
create a triangulated mesh from a set of points in a plane, which has the
property that no data point lies within the circumcircle of any triangle
in the triangulation. This property guarantees that the resulting cells
in the tessellation do not overlap with each other.
By using Delaunay triangulation, HVT can achieve a
partitioning of the data space into distinct and non-overlapping
regions, which is crucial for accurately representing and analyzing the
compressed data. Additionally, the use of Delaunay triangulation for
tessellation ensures that the resulting cells have well-defined shapes,
typically triangles in two dimensions or tetrahedra in three
dimensions.
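The definition of a Voronoi cell (all points closest to a given seed) can be illustrated without any geometry library. The base R sketch below assigns a few made-up points to the cell of their nearest seed; `voronoi_cell` is a hypothetical helper, and the package itself relies on deldir to construct the actual cell boundaries.

```r
# Assign each query point to the Voronoi cell of its nearest seed
# (Euclidean distance). Rows of `seeds` and `points` are coordinates.
voronoi_cell <- function(points, seeds) {
  apply(points, 1, function(pt) {
    which.min(colSums((t(seeds) - pt)^2))
  })
}

seeds <- matrix(c(0, 0,
                  4, 0,
                  0, 4), ncol = 2, byrow = TRUE)
points <- matrix(c(0.5, 0.5,
                   3.0, 1.0,
                   1.0, 3.5), ncol = 2, byrow = TRUE)
cells <- voronoi_cell(points, seeds)
```

Here the three points fall into cells 1, 2 and 3 respectively. Because every point has exactly one nearest seed (ties aside), the cells cannot overlap, which is the property the tessellation relies on.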
The hierarchical structure resulting from tessellation preserves the inherent structure and relationships within the data. It captures clusters, subclusters, and other patterns in the data, allowing for a more organized and interpretable representation. The hierarchical structure reduces redundancy and enables more compact representations.
Tessellate: Constructing Voronoi Tessellation
In this package, we use sammon from the package
MASS to project higher dimensional data to a 2D space. The
function hvq, called from the HVT function,
returns hierarchical quantized data which is the input for the
construction of the tessellations. The data is then represented in 2D
coordinates and the tessellations are plotted using these coordinates as
centroids. We use the package deldir for this purpose. The
deldir package computes the Delaunay triangulation (and
hence the Dirichlet or Voronoi tessellation) of a planar point set
according to the second (iterative) algorithm of Lee and Schachter. For
subsequent levels, a transformation is performed on the 2D coordinates to
place all the points within their parent tile. Tessellations are plotted
using these transformed points as centroids. The lines in the
tessellations are clipped so that they do not protrude outside
the parent polygon. This is done for all the subsequent levels.
Prediction refers to the process of estimating future values or outcomes based on existing data patterns. In data prediction, a model is developed based on historical data or a training dataset, and this model is then used to make predictions on new, unseen data. The model captures the underlying patterns, trends, and relationships present in the training data, allowing it to make informed predictions on similar or related data points.
In this package, we use the predictHVT function to predict
each point in the test dataset.
Prediction Algorithm
The prediction algorithm recursively calculates the distance between each point in the test dataset and the cell centroids for each level. The following steps explain the prediction method for a single point in the test dataset:
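As a rough illustration of the recursive nearest-centroid search described above, here is a base R sketch for a single point. The map structure (`make_node`, nested `children` lists) and the function `predict_cell` are assumptions made for this example and do not reflect the internal layout used by predictHVT.

```r
# Hypothetical two-level map: each node holds a centroid and a list of children.
make_node <- function(centroid, children = list()) {
  list(centroid = centroid, children = children)
}

l1_dist <- function(a, b) sum(abs(a - b))

# Walk down the hierarchy, picking the nearest centroid at each level.
# Returns the index path, e.g. c(2, 1) = cell 2 at level 1, child 1 at level 2.
predict_cell <- function(point, nodes, dist_fun = l1_dist) {
  path <- integer(0)
  while (length(nodes) > 0) {
    d <- vapply(nodes, function(nd) dist_fun(point, nd$centroid), numeric(1))
    best <- which.min(d)
    path <- c(path, best)
    nodes <- nodes[[best]]$children
  }
  path
}

map <- list(
  make_node(c(0, 0), list(make_node(c(-1, 0)), make_node(c(1, 0)))),
  make_node(c(10, 10), list(make_node(c(9, 10)), make_node(c(11, 10))))
)
path <- predict_cell(c(10.8, 9.9), map)
```

For the point (10.8, 9.9) the search selects the second cell at level 1 and then its second child, so `path` is `c(2, 2)`. Only the children of the winning cell are examined at each level, which keeps the number of distance computations small.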
In this section, we will see how we can use the package to visualize multidimensional data by projecting it to two dimensions using Sammon’s projection, and how the resulting map can further be used for scoring.
Data Understanding
First of all, let us see how to generate data for a torus. We are using
the geozoo library for this purpose. Geo Zoo (short for
Geometric Zoo) is a compilation of geometric objects ranging from three
to 10 dimensions. Geo Zoo contains regular or well-known objects, e.g.
cube and sphere, and some abstract objects, e.g. Boy’s surface, Torus
and Hyper-Torus.
Here, we will generate a 3D torus (a torus is a surface of revolution generated by revolving a circle in three-dimensional space one full revolution about an axis that is coplanar with the circle) with 12000 points.
Raw Torus Dataset
The torus dataset includes the following columns: x, y and z, the coordinates of each point.
Let’s explore the raw torus dataset containing 12000 points. For the sake of brevity, we display the first six rows.
library(dplyr)  # provides %>% and select()
set.seed(240)
# Here p represents dimension of object
# n represents number of points
torus <- geozoo::torus(p = 3, n = 12000)
torus_df <- data.frame(torus$points)
colnames(torus_df) <- c("x", "y", "z")
torus_df1 <- torus_df %>% round(4)
torus_df1$Row.No <- as.numeric(row.names(torus_df))
torus_df1 <- torus_df1 %>% dplyr::select(Row.No, x, y, z)
Table(head(torus_df1))

| Row.No | x | y | z |
|---|---|---|---|
| 1 | -2.6282 | 0.5656 | -0.7253 |
| 2 | -1.4179 | -0.8903 | 0.9455 |
| 3 | -1.0308 | 1.1066 | -0.8731 |
| 4 | 1.8847 | 0.1895 | 0.9944 |
| 5 | -1.9506 | -2.2507 | 0.2071 |
| 6 | -1.4824 | 0.9229 | 0.9672 |
We will first split the torus data into train and test sets. We will randomly select 9000 data points for training and use the remaining 3000 data points for testing.
set.seed(42)
train_indices <- sample(1:nrow(torus_df), 9000)
trainTorus <- torus_df[train_indices, ]
trainTorus_data <- trainTorus %>% round(4)
test_indices <- setdiff(1:nrow(torus_df), train_indices)
testTorus <- torus_df[test_indices, ]
Raw Training Dataset
First of all, we will look at the randomly selected training data containing 9000 data points. For the sake of brevity, we display the first six rows.
trainTorus_data$Row.No <- as.numeric(row.names(trainTorus_data))
trainTorus_data <- trainTorus_data %>% dplyr::select(Row.No,x,y,z)
row.names(trainTorus_data) <- NULL
Table(head(trainTorus_data))

| Row.No | x | y | z |
|---|---|---|---|
| 10801 | -0.6864 | -0.8709 | 0.4537 |
| 2369 | 0.0470 | -1.4714 | 0.8493 |
| 5273 | 1.4155 | 0.0936 | 0.8136 |
| 9290 | 0.2448 | 1.1402 | -0.5520 |
| 1252 | -2.0865 | 0.0771 | 0.9961 |
| 8826 | 2.9131 | -0.0627 | -0.4061 |
Now let’s have a look at structure and summary of the training data.
str(trainTorus_data)
#> 'data.frame': 9000 obs. of 4 variables:
#> $ Row.No: num 10801 2369 5273 9290 1252 ...
#> $ x : num -0.686 0.047 1.415 0.245 -2.087 ...
#> $ y : num -0.8709 -1.4714 0.0936 1.1402 0.0771 ...
#> $ z : num 0.454 0.849 0.814 -0.552 0.996 ...
summary(trainTorus_data)
#> Row.No x y z
#> Min. : 1 Min. :-2.997700 Min. :-2.995600 Min. :-1.000000
#> 1st Qu.: 2988 1st Qu.:-1.151025 1st Qu.:-1.118100 1st Qu.:-0.716225
#> Median : 5986 Median : 0.022200 Median :-0.000600 Median : 0.016950
#> Mean : 5988 Mean :-0.002215 Mean : 0.002805 Mean : 0.004401
#> 3rd Qu.: 8974 3rd Qu.: 1.140325 3rd Qu.: 1.125900 3rd Qu.: 0.719875
#> Max. :12000 Max. : 2.998100 Max. : 2.999300 Max. : 1.000000
Raw Testing Dataset
Now, let’s have a look at the randomly selected testing dataset containing 3000 data points. For the sake of brevity, we display the first six rows.
test_dataset <- testTorus
test_dataset1 <- round(test_dataset,4)
test_dataset1$Row.No <- row.names(test_dataset)
test_dataset1 <- test_dataset1 %>% dplyr::select(Row.No,x,y,z)
rownames(test_dataset1) <- NULL
Table(head(test_dataset1))

| Row.No | x | y | z |
|---|---|---|---|
| 6 | -1.4824 | 0.9229 | 0.9672 |
| 10 | 0.7920 | -1.3482 | -0.8998 |
| 12 | -2.3787 | 1.7986 | -0.1878 |
| 17 | -0.8428 | -0.5436 | 0.0755 |
| 20 | -2.6487 | -0.5745 | 0.7040 |
| 23 | -1.1130 | -0.6516 | -0.7040 |
Now let’s have a look at structure and summary of the test data.
str(test_dataset)
#> 'data.frame': 3000 obs. of 3 variables:
#> $ x: num -1.482 0.792 -2.379 -0.843 -2.649 ...
#> $ y: num 0.923 -1.348 1.799 -0.544 -0.574 ...
#> $ z: num 0.9672 -0.8998 -0.1878 0.0755 0.704 ...
summary(test_dataset)
#> x y z
#> Min. :-2.9976672 Min. :-2.99934 Min. :-1.000000
#> 1st Qu.:-1.1408711 1st Qu.:-1.09877 1st Qu.:-0.700378
#> Median :-0.0670732 Median : 0.06562 Median : 0.012098
#> Mean : 0.0008702 Mean : 0.03297 Mean : 0.004486
#> 3rd Qu.: 1.1404037 3rd Qu.: 1.14810 3rd Qu.: 0.713435
#> Max. : 2.9995467 Max. : 2.98818 Max. : 0.999999
Now let’s try to visualize the torus (donut) in 3D space.
knitr::include_graphics('torus_donut.png')
Figure 2: 3D Torus
Note: The steps of compression, projection, and tessellation are performed iteratively until a minimum compression rate of 80% is achieved. Once the desired compression is attained, the resulting model object is used for scoring with the predictHVT() function.
In this section all the outlined workflow steps provided in the abstract section (Compression, Projection, Tessellation and Prediction) are executed at level 1.
The core function for compression in the workflow is
HVQ, which is called within the HVT function.
It has a parameter called quantization error. This parameter acts as a
threshold and determines the number of levels in the hierarchy. It means
that, if there are ‘n’ levels in the hierarchy, then all the
clusters formed till this level will have a quantization error equal to or
greater than the threshold quantization error. The user can define the
number of clusters in the first level of the hierarchy, and then each cluster
in the first level is sub-divided into the same number of clusters as there
are in the first level. This process continues, and each cluster is divided
into smaller clusters as long as its quantization error remains above the threshold.
The output of this technique is hierarchically arranged vector
quantized data.
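The recursive subdivision described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: it uses `stats::kmeans`, an L1 distance with the max error metric, and hypothetical names (`cell_qe`, `hvq_sketch`); it is not the package’s HVQ implementation.

```r
# Maximum L1 distance from a cell's centroid to its points (quantization error).
cell_qe <- function(pts) {
  centroid <- colMeans(pts)
  max(rowSums(abs(sweep(pts, 2, centroid))))
}

# Recursively split `pts` with k-means until a cell's error is at or below
# `quant_err`, the cell is too small to split, or `depth` levels are used.
# Returns a flat list of leaf cells with their level, size and error.
hvq_sketch <- function(pts, n_cells, depth, quant_err, level = 1) {
  qe <- cell_qe(pts)
  too_small <- nrow(pts) < max(3, n_cells + 1)
  if (qe <= quant_err || too_small || level > depth) {
    return(list(list(level = level - 1, n = nrow(pts), qe = qe)))
  }
  km <- kmeans(pts, centers = n_cells, nstart = 5)
  cells <- split(as.data.frame(pts), km$cluster)
  unlist(lapply(cells, function(cell) {
    hvq_sketch(as.matrix(cell), n_cells, depth, quant_err, level + 1)
  }), recursive = FALSE)
}

set.seed(7)
pts <- matrix(runif(400), ncol = 2)       # 200 made-up points in 2D
leaves <- hvq_sketch(pts, n_cells = 4, depth = 3, quant_err = 0.3)
leaf_sizes  <- vapply(leaves, function(l) l$n, numeric(1))
leaf_levels <- vapply(leaves, function(l) l$level, numeric(1))
```

Every point ends up in exactly one leaf cell, and no leaf lies deeper than `depth`. Cells whose error is already below the threshold stop early, so different branches of the hierarchy can end at different levels.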
However, let’s try to comprehend the HVT function first before moving on.
HVT(
  dataset,
  min_compression_perc,
  n_cells,
  depth,
  quant.err,
  projection.scale,
  normalize = T,
  distance_metric = c("L1_Norm", "L2_Norm"),
  error_metric = c("mean", "max"),
  quant_method = c("kmeans", "kmedoids"),
  diagnose = TRUE,
  hvt_validation = FALSE,
  train_validation_split_ratio = 0.8
)
Each of the parameters of the HVT function is explained below:
dataset - A dataframe with numeric
columns.
min_compression_perc - An integer
indicating the minimum percent compression rate to be achieved for the
dataset.
n_cells - An integer indicating the
number of cells per hierarchy (level).
depth - An integer indicating the
number of levels (1 = no hierarchy, 2 = 2 levels, etc.).
quant.err - A number indicating
the quantization error threshold. A cell will only break down into
further cells if the quantization error of the cell is above the defined
quantization error threshold.
projection.scale - A number
indicating the scale factor for the tessellations so as to visualize the
sub-tessellations well enough.
scale_summary - A list with mean
and standard deviation values for all the features in the dataset. Pass
the scale summary when the input dataset is already scaled or normalize
is set to FALSE.
distance_metric - The distance
metric can be L1_Norm or L2_Norm.
L1_Norm is selected by default. The distance metric is used
to calculate the distance between an n dimensional point
and centroid. The user can also pass a custom function to calculate this
distance.
error_metric - The error metric can
be mean or max. max is selected
by default. max will return the max of m
values and mean will take mean of m values
where each value is a distance between a point and centroid of the cell.
Moreover, the user can also pass a custom function to calculate the
error metric.
quant_method - The quantization
method can be kmeans or kmedoids.
kmeans is selected by default.
normalize - A logical value
indicating whether the columns in your dataset need to be normalized.
Default value is TRUE. The algorithm supports Z-score
normalization.
diagnose - A logical value
indicating whether user wants to perform diagnostics on the model.
Default value is TRUE.
hvt_validation - A logical value
indicating whether user wants to holdout a validation set and find mean
absolute deviation of the validation points from the centroid. Default
value is FALSE.
train_validation_split_ratio - A
numeric value indicating train validation split ratio. This argument is
only used when hvt_validation has been set to TRUE. Default value for
the argument is 0.8.
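To illustrate the Z-score normalization mentioned for normalize and the kind of information a scale summary carries, here is a small base R sketch. The list element names `mean_data` and `std_data` and the toy values are assumptions for this example and may differ from what the package expects.

```r
df <- data.frame(x = c(1, 2, 3, 4), y = c(10, 20, 30, 40))  # made-up values

# Keep the per-column mean and standard deviation so that new data can be
# scaled with the same parameters later on.
scale_summary <- list(
  mean_data = colMeans(df),
  std_data  = vapply(df, sd, numeric(1))
)
df_scaled <- as.data.frame(scale(df, center = scale_summary$mean_data,
                                 scale  = scale_summary$std_data))

# Scaling a new observation with the stored training parameters
new_point  <- c(x = 2.5, y = 25)
new_scaled <- (new_point - scale_summary$mean_data) / scale_summary$std_data
```

After scaling, each column has mean 0 and standard deviation 1, and the new observation, which sits exactly at the column means, maps to (0, 0). Reusing the stored mean and standard deviation, rather than recomputing them on new data, keeps training and scoring data on the same scale.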
We will use the HVT function to compress our data while
preserving essential features of the dataset. Our goal is to achieve
data compression of at least 80%. In situations where the
compression ratio does not meet the desired target, we can explore
adjusting the model parameters as a potential solution. This involves
modifying parameters such as the
quantization error threshold or
increasing the number of cells and then rerunning the HVT
function.
In our example, we will iteratively increase the number of cells until the desired compression percentage is reached, rather than increase the quantization threshold, because raising the threshold may reduce the level of detail captured in the data representation.
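The idea of raising the number of cells until enough of them meet the threshold can be sketched with base R alone. The example below is a stand-in that uses `stats::kmeans` on made-up 2D data instead of calling HVT; the helper `frac_cells_below`, the candidate cell counts and the data are all assumptions for illustration.

```r
# Fraction of k-means cells whose max L1 distance to the centroid is <= quant_err
frac_cells_below <- function(pts, n_cells, quant_err) {
  km <- kmeans(pts, centers = n_cells, nstart = 3, iter.max = 50)
  qe <- vapply(seq_len(n_cells), function(k) {
    cell <- pts[km$cluster == k, , drop = FALSE]
    max(rowSums(abs(sweep(cell, 2, km$centers[k, ]))))
  }, numeric(1))
  mean(qe <= quant_err)
}

set.seed(240)
pts <- matrix(runif(1000), ncol = 2)      # 500 made-up points in 2D
target <- 0.8                             # desired fraction of compressed cells
for (n_cells in c(10, 50, 200)) {
  frac <- frac_cells_below(pts, n_cells, quant_err = 0.1)
  cat(sprintf("n_cells = %d: %.0f%% of cells below threshold\n",
              n_cells, 100 * frac))
  if (frac >= target) break               # stop once the target is met
}
```

The loop stops at the first cell count that meets the target, which mirrors the manual procedure followed in the rest of this section with 100, 300 and 900 cells.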
We will pass the below mentioned model parameters along with torus
dataset to HVT function.
Model Parameters
set.seed(240)
hvt.torus <- muHVT::HVT(
  torus_df,
  n_cells = 100,
  depth = 1,
  quant.err = 0.1,
  projection.scale = 10,
  normalize = F,
  distance_metric = "L1_Norm",
  error_metric = "max",
  quant_method = "kmeans"
)
Let’s check out the compression summary.
compressionSummaryTable(hvt.torus[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 100 | 0 | 0 | n_cells: 100 quant.err: 0.1 distance_metric: L1_Norm error_metric: max quant_method: kmeans |
As can be seen from the table above, none of the 100 cells has reached the quantization error threshold. We can therefore subdivide the cells further by increasing the n_cells parameter and then check whether the desired compression (80%) is reached.
Let’s retry by increasing the n_cells parameter to 300.
Model Parameters
set.seed(240)
hvt.torus2 <- muHVT::HVT(
  torus_df,
  n_cells = 300,
  depth = 1,
  quant.err = 0.1,
  projection.scale = 10,
  normalize = F,
  distance_metric = "L1_Norm",
  error_metric = "max",
  quant_method = "kmeans"
)
Let’s check out the compression summary again.
compressionSummaryTable(hvt.torus2[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 300 | 5 | 0.02 | n_cells: 300 quant.err: 0.1 distance_metric: L1_Norm error_metric: max quant_method: kmeans |
It can be observed from the table above that only 5 cells out
of 300, i.e. 2% of the cells, reached the quantization error
threshold. We can therefore subdivide the cells further by increasing
the n_cells parameter and then check whether 80% compression is reached.
Since we have yet to achieve a compression of at least 80%, let’s try again by increasing the n_cells parameter to 900.
Model Parameters
set.seed(240)
hvt.torus3 <- muHVT::HVT(
  torus_df,
  n_cells = 900,
  depth = 1,
  quant.err = 0.1,
  projection.scale = 10,
  normalize = F,
  distance_metric = "L1_Norm",
  error_metric = "max",
  quant_method = "kmeans"
)
Let’s check the compression summary for torus.
compressionSummaryTable(hvt.torus3[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 900 | 768 | 0.85 | n_cells: 900 quant.err: 0.1 distance_metric: L1_Norm error_metric: max quant_method: kmeans |
By increasing the number of cells to 900, 768 of the 900 cells (85%)
reached the quantization error threshold, which meets the 80% compression target, so we will
not subdivide the cells further.
Having achieved 85% compression with the n_cells parameter set to 900, the next step is to perform data projection on the compressed data. In this step, the compressed data will be transformed and projected onto a lower-dimensional space to visualize and analyze the data in a more manageable form.
The function sammonsProjection(), which is called in HVT, utilizes the
sammon function from the MASS package.
Sammon’s projection is an algorithm that maps a
high-dimensional space to a space of lower dimensionality while
attempting to preserve the structure of inter-point distances in the
projection. The centroids are plotted in 2D after performing Sammon’s
projection at every level of the tessellation.
Let’s view the projected 2D coordinates after performing Sammon’s projection on the compressed data for the first iteration, where we set the n_cells parameter to 100. For the sake of brevity, we display the first six rows.
hvt_torus_coordinates <- hvt.torus[[2]][[1]][["1"]]
# Collect the projected 2D point (pt) of every level 1 cell
coordinates_value <- lapply(seq_along(hvt_torus_coordinates), function(x) {
  hvt_torus_coordinates[[x]]$pt
})
centroid_coordinates <- do.call(rbind.data.frame, coordinates_value)
colnames(centroid_coordinates) <- c("x_coord", "y_coord")
centroid_coordinates$Row.No <- as.numeric(row.names(centroid_coordinates))
centroid_coordinates <- centroid_coordinates %>% dplyr::select(Row.No, x_coord, y_coord)
centroid_coordinates1 <- centroid_coordinates %>% data.frame() %>% round(4)
Table(head(centroid_coordinates1), scroll = T, limit = 20)

| Row.No | x_coord | y_coord |
|---|---|---|
| 1 | 15.4686 | 9.1562 |
| 2 | -12.3060 | -3.5491 |
| 3 | -6.9791 | 19.6759 |
| 4 | 9.5694 | -0.5423 |
| 5 | 24.8946 | 17.7822 |
| 6 | 24.0559 | 6.8543 |
Let’s plot the 2D Sammon’s projection with n_cells set to 100 in the first iteration.
library(ggplot2)
ggplot(centroid_coordinates1, aes(x_coord, y_coord)) +
  geom_point(color = "blue") +
  labs(x = "X", y = "Y")
Figure 3: Sammons 2D Plot for 100 cells
Let’s view the projected 2D coordinates after performing Sammon’s projection on the compressed data for the second iteration, where we set the n_cells parameter to 300. For the sake of brevity, we display the first six rows.
hvt_torus_coordinates <- hvt.torus2[[2]][[1]][["1"]]
# Collect the projected 2D point (pt) of every level 1 cell
coordinates_value <- lapply(seq_along(hvt_torus_coordinates), function(x) {
  hvt_torus_coordinates[[x]]$pt
})
centroid_coordinates <- do.call(rbind.data.frame, coordinates_value)
colnames(centroid_coordinates) <- c("x_coord", "y_coord")
centroid_coordinates$Row.No <- as.numeric(row.names(centroid_coordinates))
centroid_coordinates <- centroid_coordinates %>% dplyr::select(Row.No, x_coord, y_coord)
centroid_coordinates2 <- centroid_coordinates %>% data.frame() %>% round(4)
Table(head(centroid_coordinates2), scroll = T, limit = 20)

| Row.No | x_coord | y_coord |
|---|---|---|
| 1 | 23.7284 | 5.0557 |
| 2 | -11.2747 | 1.3672 |
| 3 | 11.2157 | 26.5876 |
| 4 | 8.5268 | -3.7218 |
| 5 | 30.3534 | 5.0864 |
| 6 | 29.4938 | -0.6784 |
Let’s plot the 2D Sammon’s projection with n_cells set to 300 in the second iteration.
ggplot(centroid_coordinates2, aes(x_coord, y_coord)) +
  geom_point(color = "blue") +
  labs(x = "X", y = "Y")
Figure 4: Sammons 2D Plot for 300 cells
Let’s view the projected 2D coordinates after performing Sammon’s projection on the compressed data for the third iteration, where we set the n_cells parameter to 900. For the sake of brevity, we display the first six rows.
hvt_torus_coordinates <- hvt.torus3[[2]][[1]][["1"]]
# Collect the projected 2D point (pt) of every level 1 cell
coordinates_value <- lapply(seq_along(hvt_torus_coordinates), function(x) {
  hvt_torus_coordinates[[x]]$pt
})
centroid_coordinates <- do.call(rbind.data.frame, coordinates_value)
colnames(centroid_coordinates) <- c("x_coord", "y_coord")
centroid_coordinates$Row.No <- as.numeric(row.names(centroid_coordinates))
centroid_coordinates <- centroid_coordinates %>% dplyr::select(Row.No, x_coord, y_coord)
centroid_coordinates3 <- centroid_coordinates %>% data.frame() %>% round(4)
Table(head(centroid_coordinates3), scroll = T, limit = 20)

| Row.No | x_coord | y_coord |
|---|---|---|
| 1 | 19.2964 | -18.4704 |
| 2 | -5.9543 | 10.4406 |
| 3 | 25.5603 | 0.6926 |
| 4 | 1.5064 | -9.0975 |
| 5 | 18.3666 | -24.9166 |
| 6 | 17.3898 | -22.7207 |
Let’s plot the 2D Sammon’s projection with n_cells set to 900 in the third iteration.
ggplot(centroid_coordinates3, aes(x_coord, y_coord)) +
  geom_point(color = "blue") +
  labs(x = "X", y = "Y")
Figure 5: Sammons 2D Plot for 900 cells
The deldir package computes the Delaunay triangulation
(and hence the Dirichlet or Voronoi tessellation) of a planar point set
according to the second (iterative) algorithm of Lee and Schachter. For
subsequent levels, a transformation is performed on the 2D coordinates to
place all the points within their parent tile. Tessellations are plotted
using these transformed points as centroids. plotHVT is the
main function to plot hierarchical Voronoi tessellations.
Now let’s try to understand plotHVT function. The parameters have been explained in detail below:
plotHVT(hvt.results, line.width, color.vec, pch1 = 21, centroid.size = 3, title = NULL, maxDepth = 1)
hvt.results - A list containing the
output of the HVT function which has the details of the tessellations to
be plotted.
line.width - A vector indicating
the line widths of the tessellation boundaries for each layer.
color.vec - A vector indicating the
colors of the tessellations boundaries at each layer.
pch1 - Symbol type of the centroids
of the tessellations (parent levels). Refer points (default =
21).
centroid.size - Size of centroids
of first level tessellations (default = 3).
title - Set a title for the plot
(default = NULL).
maxDepth - An integer indicating
the number of levels (default = 1).
To enhance visualization, let’s generate a plot of the Voronoi tessellation for the first iteration where we set n_cells parameter as 100. This plot will provide a visual representation of the Voronoi regions corresponding to the data points, aiding in the analysis and understanding of the data distribution.
muHVT::plotHVT(
  hvt.torus,
  line.width = c(0.4),
  color.vec = c("#141B41"),
  centroid.size = 0.6,
  maxDepth = 1
)
Figure 6: The Voronoi tessellation for layer 1 shown for the 100 cells in the dataset ’torus’
Now, let’s plot the Voronoi tessellation for the second iteration where we set n_cells parameter to 300.
muHVT::plotHVT(
  hvt.torus2,
  line.width = c(0.4),
  color.vec = c("#141B41"),
  centroid.size = 0.6,
  maxDepth = 1
)
Figure 7: The Voronoi tessellation for layer 1 shown for the 300 cells in the dataset ’torus’
Now, let’s plot the Voronoi tessellation again, for the third iteration where we set n_cells parameter to 900.
muHVT::plotHVT(
  hvt.torus3,
  line.width = c(0.4),
  color.vec = c("#141B41"),
  centroid.size = 0.6,
  maxDepth = 1
)
Figure 8: The Voronoi tessellation for layer 1 shown for the 900 cells in the dataset ’torus’
In the plot above, the inherent structure of the donut can be easily observed in the two-dimensional space.
We will now overlay each feature as a heatmap over the Voronoi tessellation plot for better visualization and identification of patterns, trends, and variations in the data.
Heat Maps
Let’s have a look at the hvtHmap function, which we will
use to overlay a variable as a heatmap.
hvtHmap(hvt.results, dataset, child.level, hmap.cols, color.vec, line.width, palette.color = 6)
hvt.results - A list of results
obtained from the HVT function.
dataset - A dataframe containing
the variables to overlay as a heatmap. The user can pass an external
dataset or the dataset that was used to perform hierarchical vector
quantization. The dataset should have the same number of points as the
dataset used to perform hierarchical Vector Quantization in the HVT
function.
child.level - A number indicating
the level for which the heat map is to be plotted.
hmap.cols - The column number or
column name from the dataset indicating the variable for which the heat
map is to be plotted. To plot the quantization error as a heatmap, pass
'quant_error'. Similarly, to plot the number of points in each
cell as a heatmap, pass 'no_of_points' as a
parameter.
color.vec - A color vector such
that length(color.vec) = child.level (default = NULL).
line.width - A line width vector
such that length(line.width) = child.level (default = NULL).
palette.color - A number indicating
the heat map color palette. 1 - rainbow, 2 - heat.colors, 3 -
terrain.colors, 4 - topo.colors, 5 - cm.colors, 6 - BlCyGrYlRd
(Blue,Cyan,Green,Yellow,Red) color (default = 6).
show.points - A boolean indicating
whether the centroids should be plotted on the tessellations (default =
FALSE).
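Conceptually, a heatmap over the tessellation needs one value per cell, which is then mapped to a colour. The base R sketch below shows that step on made-up data. The cell assignment, the use of the mean as the per-cell aggregate, and the five-colour ramp are assumptions for illustration; they are not taken from the hvtHmap source.

```r
set.seed(3)
dat <- data.frame(x = runif(12), z = rnorm(12))   # made-up data
cell_id <- rep(1:3, each = 4)                     # hypothetical cell assignment

# One value per cell: the number of points, and the mean of a feature
n_per_cell      <- as.vector(table(cell_id))
mean_z_per_cell <- as.vector(tapply(dat$z, cell_id, mean))

# Map each cell's mean onto a 5-colour ramp, low -> high
palette <- colorRampPalette(c("blue", "cyan", "green", "yellow", "red"))(5)
bins <- cut(mean_z_per_cell, breaks = 5, labels = FALSE)
cell_colours <- palette[bins]
```

Each cell receives a single colour, so cells with similar aggregated values look alike on the map regardless of how many points they contain.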
Now let’s plot the Voronoi Tessellation with the heatmap overlaid for all the features in the torus data for better visualization and interpretation of data patterns and distributions.
The heatmaps displayed below provide a visual representation of the spatial characteristics of the torus, allowing us to observe patterns and trends in the distribution of each of the features (n, x, y and z). In each heatmap, the green shades highlight regions with higher coordinate values, while the indigo shades indicate areas with the lowest coordinate values. By analyzing these heatmaps, we can gain insights into the variations and relationships between these features within the torus structure.
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "n",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.8,
show.points = T,
quant.error.hmap = 0.1,
n_cells.hmap = 15
)
Figure 9: The Voronoi tessellation for layer 1 and number of cells 900 with the heat map overlaid for No. of entities in each cell in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "x",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.8,
show.points = T,
quant.error.hmap = 0.1,
n_cells.hmap = 15
)
Figure 10: The Voronoi tessellation for layer 1 and number of cells 900 with the heat map overlaid for variable x in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "y",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.8,
show.points = T,
quant.error.hmap = 0.1,
n_cells.hmap = 15
)
Figure 11: The Voronoi tessellation for layer 1 and number of cells 900 with the heat map overlaid for variable y in the ’torus’ dataset
muHVT::hvtHmap(
hvt.torus3,
torus_df,
child.level = 1,
hmap.cols = "z",
line.width = c(0.4),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.8,
show.points = T,
quant.error.hmap = 0.1,
n_cells.hmap = 15
)
Figure 12: The Voronoi tessellation for layer 1 and number of cells 900 with the heat map overlaid for variable z in the ’torus’ dataset
Raw Testing Dataset
Let’s have a look at our randomly selected test dataset (3000 points) before we pass it to the predictHVT function for scoring.
Table(head(test_dataset1))

| Row.No | x | y | z |
|---|---|---|---|
| 6 | -1.4824 | 0.9229 | 0.9672 |
| 10 | 0.7920 | -1.3482 | -0.8998 |
| 12 | -2.3787 | 1.7986 | -0.1878 |
| 17 | -0.8428 | -0.5436 | 0.0755 |
| 20 | -2.6487 | -0.5745 | 0.7040 |
| 23 | -1.1130 | -0.6516 | -0.7040 |
Before moving on, however, let’s first understand the predictHVT function.
predictHVT(data,
hvt.results,
hmap.cols = NULL,
child.level = 1,
...)
The important parameters for the function predictHVT are
as below:
data - A dataframe containing the
test dataset. The dataframe should have all the variable(features) used
for training. The variables from this dataset can also be used to
overlay as heatmap.
hvt.results - A list of hvt.results
obtained from the HVT function while performing hierarchical vector
quantization on training data. The list contains detailed information
about the hierarchical vector quantized data along with a summary
section containing the number of points, the quantization error and the
centroids for each cell, as per the n_cells given to the HVT() function.
hmap.cols - The column number or
column name from the dataset indicating the variable for which the heat
map is to be plotted. A heatmap won’t be plotted if NULL is passed
(default = NULL).
child.level - A number indicating
the level for which the heat map is to be plotted (only used if
hmap.cols is not NULL). Each level represents a different level of
clustering or partitioning of the data.
normalize - A logical value
indicating if the columns in your dataset should be normalized.
Normalization scales the values of each variable to
have a mean of 0 and a standard deviation of 1. The default value is
TRUE.
distance_metric - It specifies the
type of distance measurement used to calculate similarity or
dissimilarity between data points. It can be set to “Euclidean”
(default) for straight-line distance or “Manhattan” for the sum of
absolute differences between coordinates.
error_metric - It specifies the
error metrics to be used for evaluating the performance of the model. It
can be “mean” or “max”. mean is selected by default.
yVar - Name of the dependent
variable(s)
... - color.vec and line.width can
be passed from here
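To make the roles of normalize and distance_metric more concrete, here is a minimal base-R sketch of nearest-centroid scoring. It is not the predictHVT implementation: the helper score_points, the toy centroids, and the training means and standard deviations are all hypothetical, and the sketch simply assumes that scoring assigns each normalized test point to its closest centroid.

```r
# Toy codebook: three cell centroids in the original (unscaled) units.
centroids <- data.frame(x = c(0, 5, 10), y = c(0, 5, 0))

# Assumed statistics of the training data, used for normalization.
train_mean <- c(x = 5, y = 2)
train_sd   <- c(x = 4, y = 2.5)

# Hypothetical helper: index of the nearest centroid for every row of data.
score_points <- function(data, centroids, distance_metric = "Euclidean") {
  pts <- as.matrix(data)
  cen <- as.matrix(centroids)
  apply(pts, 1, function(p) {
    diffs <- sweep(cen, 2, p)  # each centroid minus the point
    d <- if (distance_metric == "Manhattan") {
      rowSums(abs(diffs))      # sum of absolute coordinate differences
    } else {
      sqrt(rowSums(diffs^2))   # straight-line distance
    }
    which.min(d)
  })
}

# New points are scaled with the TRAINING mean and sd, not their own.
test <- data.frame(x = c(1, 9, 6), y = c(0.5, 1, 4))
test_scaled <- scale(test, center = train_mean, scale = train_sd)
cen_scaled  <- scale(centroids, center = train_mean, scale = train_sd)

cell_euclid    <- score_points(test_scaled, cen_scaled, "Euclidean")
cell_manhattan <- score_points(test_scaled, cen_scaled, "Manhattan")
print(cell_euclid)
print(cell_manhattan)
```

Scaling the test points with the training statistics keeps them in the same coordinate system as the centroids, which is why the sketch does not recompute the mean and standard deviation on the test set.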
Now that we have built the model, let us use the test dataset to predict which cell and which level each point belongs to.
set.seed(240)
predictions_torus <- muHVT::predictHVT(
testTorus,
hvt.torus3,
child.level = 1,
line.width = c(1.2),
color.vec = c("#141B41"),
quant.error.hmap = 0.1,
n_cells.hmap = 900,
normalize = F
)
Let’s see which cell and level each point belongs to and check the mean absolute difference. For the sake of brevity, we will only show the first 10 rows.
data1 <- test_dataset
data1$Row.No <- row.names(test_dataset)
data1 <- data1 %>% dplyr::select(Row.No,x,y,z)
rownames(data1) <- NULL
colnames(data1) <- c("Row.No","x_act","y_act","z_act")
data2 <- predictions_torus[["scoredPredictedData"]]
data2 <- data2 %>% dplyr::select(Cell.ID,x,y,z)
colnames(data2) <- c("Cell.ID","x_pred","y_pred","z_pred")
combined <- cbind(data1,data2)
combined$diff <- rowMeans(abs(combined[, c("x_act", "y_act", "z_act")] - combined[, c("x_pred", "y_pred", "z_pred")]))
options(scipen = 999)
combined %>% head(100) %>%
as.data.frame() %>%
Table(scroll = T, limit = 10)

| Row.No | x_act | y_act | z_act | Cell.ID | x_pred | y_pred | z_pred | diff |
|---|---|---|---|---|---|---|---|---|
| 6 | -1.4823709 | 0.9228529 | 0.9672467 | 723 | -1.4824 | 0.9229 | 0.9672 | 0.0000410 |
| 10 | 0.7920450 | -1.3482111 | -0.8997781 | 252 | 0.7920 | -1.3482 | -0.8998 | 0.0000260 |
| 12 | -2.3787465 | 1.7986402 | -0.1878163 | 900 | -2.3787 | 1.7986 | -0.1878 | 0.0000344 |
| 17 | -0.8427718 | -0.5435588 | 0.0755262 | 558 | -0.8428 | -0.5436 | 0.0755 | 0.0000318 |
| 20 | -2.6486525 | -0.5744624 | 0.7039659 | 837 | -2.6487 | -0.5745 | 0.7040 | 0.0000397 |
| 23 | -1.1130327 | -0.6516259 | -0.7039507 | 628 | -1.1130 | -0.6516 | -0.7040 | 0.0000360 |
| 28 | 0.7520403 | -2.6043863 | 0.7034024 | 140 | 0.7520 | -2.6044 | 0.7034 | 0.0000188 |
| 30 | -1.6755282 | 2.3358857 | 0.4847097 | 859 | -1.6755 | 2.3359 | 0.4847 | 0.0000174 |
| 33 | -1.6466922 | 0.4011827 | -0.9523068 | 719 | -1.6467 | 0.4012 | -0.9523 | 0.0000106 |
| 34 | 0.7930278 | 2.4427927 | 0.8228262 | 458 | 0.7930 | 2.4428 | 0.8228 | 0.0000204 |
hist(combined$diff, breaks = 20, col = "blue", main = "Mean Absolute Difference", xlab = "Difference")
Figure 13: Mean Absolute Difference
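As a quick check of how the diff column is computed, the following sketch reproduces the value for the first row of the table above (Row.No 6) by hand. The variable names actual, predicted and diff_row are introduced here for illustration only; the numbers are copied from that row.

```r
# Actual and predicted coordinates for Row.No 6, copied from the table above
actual    <- c(x = -1.4823709, y = 0.9228529, z = 0.9672467)
predicted <- c(x = -1.4824,    y = 0.9229,    z = 0.9672)

# Mean absolute difference across the three coordinates,
# the same quantity that rowMeans(abs(...)) computes for every row
diff_row <- mean(abs(actual - predicted))
print(format(round(diff_row, 7), scientific = FALSE))
```

The three absolute differences are 0.0000291, 0.0000471 and 0.0000467, and their mean rounds to 0.0000410, which agrees with the tabulated diff for that row.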
Data Understanding
In this section, we will use the
Prices of Personal Computers dataset. This dataset contains
6259 observations and 10 features. The dataset observes the price from
1993 to 1995 of 486 personal computers in the US. The variables are
price, speed, ram, screen, cd, etc. The dataset can be downloaded from
here.
In this example, we will compress this dataset by using hierarchical VQ via k-means and visualize the Voronoi Tessellation plots using Sammons projection. Later on, we will overlay all the variables as a heatmap to generate further insights.
Here, we load the data and store into a variable
computers.
set.seed(240)
# Load data from csv files
computers <- read.csv("https://raw.githubusercontent.com/Mu-Sigma/muHVT/master/vignettes/sample_dataset/Computers.csv")
Raw Personal Computers Dataset
The Computers dataset includes the following columns:
Let’s explore the Personal Computers dataset (6259 points). For the sake of brevity, we display only the first six rows.
# Quick peek
Table(head(computers), scroll = T, limit = 20)

| X | price | speed | hd | ram | screen | cd | multi | premium | ads | trend |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1499 | 25 | 80 | 4 | 14 | no | no | yes | 94 | 1 |
| 2 | 1795 | 33 | 85 | 2 | 14 | no | no | yes | 94 | 1 |
| 3 | 1595 | 25 | 170 | 4 | 15 | no | no | yes | 94 | 1 |
| 4 | 1849 | 25 | 170 | 8 | 14 | no | no | no | 94 | 1 |
| 5 | 3295 | 33 | 340 | 16 | 14 | no | no | yes | 94 | 1 |
| 6 | 3695 | 66 | 340 | 16 | 14 | no | no | yes | 94 | 1 |
Now, let us check the structure of the data and analyse its summary.
str(computers)
#> 'data.frame': 6259 obs. of 11 variables:
#> $ X : int 1 2 3 4 5 6 7 8 9 10 ...
#> $ price : int 1499 1795 1595 1849 3295 3695 1720 1995 2225 2575 ...
#> $ speed : int 25 33 25 25 33 66 25 50 50 50 ...
#> $ hd : int 80 85 170 170 340 340 170 85 210 210 ...
#> $ ram : int 4 2 4 8 16 16 4 2 8 4 ...
#> $ screen : int 14 14 15 14 14 14 14 14 14 15 ...
#> $ cd : chr "no" "no" "no" "no" ...
#> $ multi : chr "no" "no" "no" "no" ...
#> $ premium: chr "yes" "yes" "yes" "no" ...
#> $ ads : int 94 94 94 94 94 94 94 94 94 94 ...
#> $ trend : int 1 1 1 1 1 1 1 1 1 1 ...
summary(computers)
#> X price speed hd
#> Min. : 1 Min. : 949 Min. : 25.00 Min. : 80.0
#> 1st Qu.:1566 1st Qu.:1794 1st Qu.: 33.00 1st Qu.: 214.0
#> Median :3130 Median :2144 Median : 50.00 Median : 340.0
#> Mean :3130 Mean :2220 Mean : 52.01 Mean : 416.6
#> 3rd Qu.:4694 3rd Qu.:2595 3rd Qu.: 66.00 3rd Qu.: 528.0
#> Max. :6259 Max. :5399 Max. :100.00 Max. :2100.0
#> ram screen cd multi
#> Min. : 2.000 Min. :14.00 Length:6259 Length:6259
#> 1st Qu.: 4.000 1st Qu.:14.00 Class :character Class :character
#> Median : 8.000 Median :14.00 Mode :character Mode :character
#> Mean : 8.287 Mean :14.61
#> 3rd Qu.: 8.000 3rd Qu.:15.00
#> Max. :32.000 Max. :17.00
#> premium ads trend
#> Length:6259 Min. : 39.0 Min. : 1.00
#> Class :character 1st Qu.:162.5 1st Qu.:10.00
#> Mode :character Median :246.0 Median :16.00
#> Mean :221.3 Mean :15.93
#> 3rd Qu.:275.0 3rd Qu.:21.50
#> Max. :339.0 Max. :35.00
Let us first split the data into train and test sets. We will randomly select 80% of the data for training and use the remainder for testing.
num_rows <- nrow(computers)
set.seed(123)
train_indices <- sample(1:num_rows, 0.8 * num_rows)
trainComputers <- computers[train_indices, ]
testComputers <- computers[-train_indices, ]
K-means is not suitable for factor variables because their sample space is discrete, so a Euclidean distance function on such a space isn’t really meaningful. Hence, we will delete the factor variables (cd, multi, premium) from our dataset, along with the row index X and the time variable trend.
trainComputers <-
trainComputers %>% dplyr::select(-c(X, cd, multi, premium, trend))
testComputers <-
testComputers %>% dplyr::select(-c(X, cd, multi, premium, trend))
Raw Training Dataset
Now, let’s have a look at the randomly selected raw training dataset (5007 data points). For the sake of brevity, we display only the first six rows.
trainComputers_data <- trainComputers %>% as.data.frame() %>% round(4)
trainComputers_data$Row.No <- as.numeric(row.names(trainComputers_data))
trainComputers_data <- trainComputers_data %>% dplyr::select(Row.No,price,speed,hd,ram,screen,ads)
row.names(trainComputers_data) <- NULL
Table(head(trainComputers_data))

| Row.No | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|
| 2463 | 2799 | 50 | 230 | 8 | 15 | 216 |
| 2511 | 2197 | 33 | 270 | 4 | 14 | 216 |
| 2227 | 2744 | 50 | 340 | 8 | 17 | 275 |
| 526 | 2999 | 66 | 245 | 16 | 15 | 139 |
| 4291 | 1974 | 33 | 200 | 4 | 14 | 248 |
| 2986 | 2490 | 33 | 528 | 16 | 14 | 267 |
Raw Testing Dataset
Now, let’s have a look at the randomly selected raw testing dataset (1252 data points). For the sake of brevity, we display only the first six rows.
#testComputers <- scale(testComputers, center = scale_attr$`scaled:center`, scale = scale_attr$`scaled:scale`)
testComputers_data <- testComputers %>% as.data.frame() %>% round(4)
testComputers_data$Row.No <- as.numeric(row.names(testComputers_data))
testComputers_data <- testComputers_data %>% dplyr::select(Row.No,price,speed,hd,ram,screen,ads)
rownames(testComputers_data) <- NULL
Table(head(testComputers_data))

| Row.No | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|
| 3 | 1595 | 25 | 170 | 4 | 15 | 94 |
| 4 | 1849 | 25 | 170 | 8 | 14 | 94 |
| 7 | 1720 | 25 | 170 | 4 | 14 | 94 |
| 10 | 2575 | 50 | 210 | 4 | 15 | 94 |
| 11 | 2195 | 33 | 170 | 8 | 15 | 94 |
| 14 | 2295 | 25 | 245 | 8 | 14 | 94 |
Now that we are familiar with the structure of the computers data, we will follow the steps below to get predictions using the Computers dataset.
For more detailed information on Data Compression please refer to section 2 of this vignette.
We will use the HVT function to compress our data while
preserving essential features of the dataset. Our goal is to achieve
data compression of at least 80%. In situations where the
compression ratio does not meet the desired target, we can explore
adjusting the model parameters as a potential solution. This involves
modifying parameters such as the
quantization error threshold or
increasing the number of cells, and then rerunning the HVT
function.
In our example, we will iteratively increase the number of cells until the desired compression percentage is reached, rather than increasing the quantization error threshold, because a higher threshold may reduce the level of detail captured in the data representation.
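The iterative strategy can be sketched in base R as follows. This is only an illustration under stated assumptions, not the HVT function itself: it uses stats::kmeans on synthetic data, the helper frac_below is hypothetical, and it assumes that a cell's quantization error is the mean Euclidean distance from its points to the cell centroid, which may differ from muHVT's exact definition. The threshold, target and grid of cell counts are arbitrary choices for the toy data.

```r
set.seed(240)

# Synthetic stand-in data: 500 rows, 3 numeric features, scaled to mean 0, sd 1
toy <- scale(matrix(rnorm(1500), ncol = 3))

# Hypothetical helper: fraction of cells whose mean point-to-centroid
# distance is at or below the quantization error threshold
frac_below <- function(data, n_cells, quant.err) {
  km <- kmeans(data, centers = n_cells, nstart = 5, iter.max = 50)
  d <- sqrt(rowSums((data - km$centers[km$cluster, , drop = FALSE])^2))
  cell_err <- tapply(d, km$cluster, mean)
  mean(cell_err <= quant.err)
}

target    <- 0.8  # desired share of cells within the threshold
quant.err <- 0.5  # assumed threshold for this toy data

# Increase the number of cells until the target is met or the grid ends
for (n_cells in seq(20, 100, by = 20)) {
  achieved <- frac_below(toy, n_cells, quant.err)
  cat(sprintf("n_cells = %d, cells within threshold = %.2f\n", n_cells, achieved))
  if (achieved >= target) break
}
```

Keeping the threshold fixed and growing the number of cells makes each cell smaller, so the codebook captures more detail instead of tolerating a coarser fit.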
We will pass the model parameters listed below, along with the
computers dataset, to the HVT function.
Model Parameters
set.seed(240)
hvt.results <- list()
hvt.results <- muHVT::HVT(trainComputers,
n_cells = 440,
depth = 1,
quant.err = 0.2,
projection.scale = 10,
normalize = T,
distance_metric = "L1_Norm",
error_metric = "max",
quant_method = "kmeans",
diagnose = F)
Now let’s check the compression summary. The table below shows, for each level, the number of cells, the number of cells with a quantization error below the threshold, and the percentage of cells with a quantization error below the threshold.
compressionSummaryTable(hvt.results[[3]]$compression_summary)

| segmentLevel | noOfCells | noOfCellsBelowQuantizationError | percentOfCellsBelowQuantizationErrorThreshold | parameters |
|---|---|---|---|---|
| 1 | 440 | 355 | 0.81 | n_cells: 440 quant.err: 0.2 distance_metric: L1_Norm error_metric: max quant_method: kmeans |
As can be seen from the table above,
81% of the cells have a quantization error below the
threshold. Since we have attained
the desired compression percentage, we will not subdivide the
cells further.
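The 81% figure follows directly from the two counts in the compression summary above. A short sketch of the arithmetic is shown here; the variable names are illustrative and are not fields of the hvt.results object.

```r
# Counts taken from the compression summary table above
no_of_cells           <- 440
cells_below_threshold <- 355

# Share of cells whose quantization error is below the threshold
pct_below <- round(cells_below_threshold / no_of_cells, 2)

# Compare against the 80% compression target stated earlier
meets_target <- pct_below >= 0.80

print(pct_below)
print(meets_target)
```

Since 355 / 440 is approximately 0.807, the rounded value of 0.81 clears the 80% target.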
hvt.results[[3]] gives us detailed
information about the hierarchical vector quantized data.
hvt.results[[3]][['summary']] gives a
table containing the number of points, the quantization error and the
codebook.
The table displayed below is the summary from hvt.results.
summaryTable(hvt.results[[3]]$summary)

| Segment.Level | Segment.Parent | Segment.Child | n | Cell.ID | Quant.Error | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 7 | 46 | 0.08 | -0.76 | -0.89 | -0.88 | -0.76 | -0.67 | 1.57 |
| 1 | 1 | 2 | 10 | 108 | 0.08 | -0.80 | -0.89 | -0.16 | -0.76 | -0.67 | 0.67 |
| 1 | 1 | 3 | 15 | 223 | 0.12 | 0.37 | -0.89 | -0.72 | -0.05 | 0.43 | -1.65 |
| 1 | 1 | 4 | 11 | 54 | 0.07 | -1.50 | -0.89 | -0.75 | -0.76 | -0.67 | 0.62 |
| 1 | 1 | 5 | 8 | 146 | 0.13 | -0.31 | 0.68 | -0.95 | -0.89 | -0.67 | -0.14 |
| 1 | 1 | 6 | 11 | 150 | 0.16 | -0.66 | 0.68 | -0.78 | -0.79 | -0.67 | -0.73 |
| 1 | 1 | 7 | 11 | 170 | 0.1 | 0.03 | -1.24 | -0.13 | -0.05 | -0.67 | 0.38 |
| 1 | 1 | 8 | 8 | 334 | 0.15 | 0.62 | 2.30 | 0.08 | -0.05 | 0.43 | 0.04 |
| 1 | 1 | 9 | 8 | 114 | 0.07 | -0.16 | 0.68 | -1.19 | -1.11 | -0.67 | 0.87 |
| 1 | 1 | 10 | 7 | 248 | 0.17 | 0.51 | -0.08 | 0.34 | -0.05 | -0.67 | -0.30 |
| 1 | 1 | 11 | 9 | 140 | 0.12 | -0.01 | 0.68 | -1.15 | -1.00 | -0.67 | 0.33 |
| 1 | 1 | 12 | 7 | 219 | 0.14 | -1.36 | 0.24 | 0.46 | -0.05 | -0.67 | -0.74 |
| 1 | 1 | 13 | 9 | 271 | 0.05 | -1.08 | 0.68 | 0.49 | -0.05 | 0.43 | -0.84 |
| 1 | 1 | 14 | 19 | 109 | 0.06 | -0.31 | -0.89 | -0.74 | -0.76 | -0.67 | 0.38 |
| 1 | 1 | 15 | 6 | 176 | 0.08 | -0.72 | -0.89 | -0.07 | -0.05 | 0.43 | 0.72 |
| 1 | 1 | 16 | 17 | 332 | 0.14 | 0.42 | 2.30 | 0.10 | -0.05 | 0.43 | 1.50 |
| 1 | 1 | 17 | 12 | 18 | 0.05 | -1.21 | -1.27 | -1.19 | -1.11 | -0.67 | 0.97 |
| 1 | 1 | 18 | 19 | 149 | 0.16 | -0.68 | -0.08 | -0.46 | -0.76 | 0.43 | 0.79 |
| 1 | 1 | 19 | 17 | 428 | 0.35 | 0.18 | 2.30 | 2.53 | 1.37 | 0.43 | -2.22 |
| 1 | 1 | 20 | 20 | 320 | 0.36 | 0.82 | -0.16 | -0.09 | -0.12 | 2.64 | 0.71 |
| 1 | 1 | 21 | 3 | 305 | 0.18 | 2.27 | -0.35 | -0.01 | -0.05 | -0.67 | -1.32 |
| 1 | 1 | 22 | 7 | 227 | 0.1 | -0.51 | -0.89 | 0.45 | -0.05 | 0.43 | -0.47 |
| 1 | 1 | 23 | 10 | 178 | 0.12 | 0.00 | 0.68 | -0.86 | -0.76 | -0.67 | -0.90 |
| 1 | 1 | 24 | 9 | 365 | 0.1 | 0.68 | -0.08 | 1.20 | 1.37 | 0.43 | -0.36 |
| 1 | 1 | 25 | 5 | 14 | 0.11 | -1.99 | -0.89 | -0.96 | -1.11 | -0.67 | 0.18 |
| 1 | 1 | 26 | 3 | 411 | 0.05 | 1.25 | -0.89 | 2.29 | 2.79 | 0.43 | 0.57 |
| 1 | 1 | 27 | 18 | 122 | 0.15 | -0.18 | -0.98 | -0.85 | -0.76 | 0.43 | 0.68 |
| 1 | 1 | 28 | 15 | 189 | 0.11 | 0.40 | -0.92 | 0.03 | -0.05 | -0.67 | 0.87 |
| 1 | 1 | 29 | 11 | 107 | 0.11 | -0.49 | -0.96 | -0.88 | -0.76 | -0.67 | -0.64 |
| 1 | 1 | 30 | 7 | 423 | 0.47 | 3.55 | 0.12 | 2.51 | 1.37 | -0.67 | 0.44 |
| 1 | 1 | 31 | 14 | 90 | 0.05 | -0.63 | -0.89 | -0.79 | -0.76 | -0.67 | 0.58 |
| 1 | 1 | 32 | 22 | 430 | 0.24 | 0.63 | 0.75 | 3.07 | 2.79 | 0.43 | -2.27 |
| 1 | 1 | 33 | 5 | 390 | 0.3 | 1.37 | -0.89 | 3.73 | -0.19 | -0.45 | 0.70 |
| 1 | 1 | 34 | 25 | 101 | 0.18 | -0.85 | -0.97 | -0.71 | -0.76 | 0.43 | 0.84 |
| 1 | 1 | 35 | 11 | 425 | 0.07 | 0.15 | 2.30 | 1.70 | 1.37 | 0.43 | -2.39 |
| 1 | 1 | 36 | 10 | 358 | 0.05 | 0.24 | -0.89 | 1.20 | 1.37 | 0.43 | -0.84 |
| 1 | 1 | 37 | 16 | 166 | 0.11 | 0.03 | 0.68 | -1.08 | -0.78 | -0.67 | -1.65 |
| 1 | 1 | 38 | 13 | 45 | 0.05 | -0.91 | -0.89 | -1.19 | -1.11 | -0.67 | 0.42 |
| 1 | 1 | 39 | 8 | 383 | 0.12 | 1.15 | 2.30 | 0.45 | 1.37 | -0.67 | -0.16 |
| 1 | 1 | 40 | 5 | 9 | 0.07 | -1.24 | -0.97 | -1.19 | -1.11 | -0.67 | 1.57 |
| 1 | 1 | 41 | 11 | 419 | 0.06 | 1.41 | -0.89 | 2.29 | 2.79 | 0.43 | -0.82 |
| 1 | 1 | 42 | 8 | 242 | 0.14 | -0.81 | -0.08 | 0.30 | -0.05 | 0.43 | -0.65 |
| 1 | 1 | 43 | 13 | 179 | 0.09 | 0.41 | 0.68 | -0.76 | -0.76 | -0.67 | 0.40 |
| 1 | 1 | 44 | 5 | 375 | 0.04 | 0.06 | -0.08 | 1.70 | 1.37 | 0.43 | -0.79 |
| 1 | 1 | 45 | 20 | 129 | 0.14 | -1.15 | 0.68 | -0.79 | -0.76 | -0.67 | -0.40 |
| 1 | 1 | 46 | 10 | 292 | 0.22 | 0.86 | 0.68 | -0.65 | -0.12 | 0.43 | -1.41 |
| 1 | 1 | 47 | 5 | 79 | 0.12 | -0.89 | -1.04 | -0.94 | -0.05 | -0.67 | 1.02 |
| 1 | 1 | 48 | 23 | 246 | 0.11 | -0.42 | 0.68 | 0.46 | -0.05 | -0.67 | -0.63 |
| 1 | 1 | 49 | 8 | 207 | 0.25 | 0.74 | -0.89 | -0.40 | -0.40 | 0.43 | 0.52 |
| 1 | 1 | 50 | 11 | 27 | 0.06 | -1.06 | -1.27 | -1.19 | -1.11 | -0.67 | 0.43 |
| 1 | 1 | 51 | 8 | 51 | 0.11 | -1.23 | -0.08 | -1.05 | -0.85 | -0.67 | 0.94 |
| 1 | 1 | 52 | 19 | 288 | 0.09 | 0.81 | -0.89 | 0.45 | 1.37 | -0.67 | 0.88 |
| 1 | 1 | 53 | 7 | 154 | 0.1 | -0.62 | 0.68 | -0.15 | -0.76 | -0.67 | 0.84 |
| 1 | 1 | 54 | 10 | 261 | 0.15 | 0.61 | -0.08 | -0.67 | -0.05 | 0.43 | -1.40 |
| 1 | 1 | 55 | 10 | 195 | 0.15 | 0.18 | 0.77 | -0.09 | -0.76 | -0.67 | 0.83 |
| 1 | 1 | 56 | 14 | 250 | 0.09 | 0.52 | 0.68 | -0.69 | -0.05 | -0.67 | -1.61 |
| 1 | 1 | 57 | 20 | 331 | 0.2 | -0.63 | 0.68 | 0.30 | -0.76 | 2.64 | -0.95 |
| 1 | 1 | 58 | 9 | 379 | 0.15 | 1.33 | -0.08 | -0.65 | -0.29 | 2.64 | -1.52 |
| 1 | 1 | 59 | 14 | 11 | 0.21 | -0.28 | -1.05 | -0.79 | -0.76 | 2.64 | 0.53 |
| 1 | 1 | 60 | 29 | 359 | 0.13 | 1.36 | 0.68 | 0.18 | 1.37 | 0.43 | 0.77 |
| 1 | 1 | 61 | 6 | 337 | 0.1 | 2.46 | 0.68 | 0.21 | -0.05 | -0.67 | -0.87 |
| 1 | 1 | 62 | 6 | 1 | 0.17 | -0.17 | -1.21 | -1.02 | -0.76 | 2.64 | 1.32 |
| 1 | 1 | 63 | 28 | 243 | 0.33 | -0.33 | 2.30 | -0.23 | -0.46 | -0.67 | -0.88 |
| 1 | 1 | 64 | 8 | 274 | 0.07 | 0.41 | -1.27 | 0.45 | 1.37 | -0.67 | 0.68 |
| 1 | 1 | 65 | 13 | 362 | 0.14 | 1.07 | 0.75 | 0.35 | 1.37 | 0.43 | 1.34 |
| 1 | 1 | 66 | 10 | 143 | 0.07 | -0.34 | -0.89 | -0.80 | -0.05 | -0.67 | -1.66 |
| 1 | 1 | 67 | 4 | 265 | 0.05 | -0.55 | 0.68 | 1.23 | -0.05 | -0.67 | -0.69 |
| 1 | 1 | 68 | 11 | 13 | 0.15 | -0.83 | -0.89 | -0.25 | -0.76 | 2.64 | -0.33 |
| 1 | 1 | 69 | 8 | 298 | 0.17 | -0.62 | 0.20 | 2.29 | -0.05 | -0.67 | -0.95 |
| 1 | 1 | 70 | 4 | 335 | 0.06 | 1.34 | -0.08 | 0.45 | 1.37 | -0.67 | -0.08 |
| 1 | 1 | 71 | 20 | 204 | 0.16 | 0.09 | -0.08 | 0.02 | -0.05 | -0.67 | 0.86 |
| 1 | 1 | 72 | 10 | 42 | 0.06 | -1.49 | -0.89 | -0.75 | -0.76 | -0.67 | 1.04 |
| 1 | 1 | 73 | 1 | 429 | 0 | 3.08 | 0.68 | 0.04 | 4.20 | 0.43 | 0.71 |
| 1 | 1 | 74 | 14 | 186 | 0.14 | -0.79 | -0.89 | 0.45 | -0.05 | -0.67 | -0.68 |
| 1 | 1 | 75 | 4 | 410 | 0.37 | 2.27 | 0.68 | 3.73 | -0.23 | -0.40 | 0.68 |
| 1 | 1 | 76 | 9 | 163 | 0.16 | 1.05 | -0.89 | -0.41 | -0.60 | -0.67 | 0.61 |
| 1 | 1 | 77 | 10 | 400 | 0.07 | -0.03 | 0.68 | 1.70 | 1.37 | 0.43 | -2.38 |
| 1 | 1 | 78 | 6 | 275 | 0.18 | 1.14 | 0.68 | 0.13 | -0.05 | -0.67 | -0.18 |
| 1 | 1 | 79 | 25 | 241 | 0.16 | -0.88 | 0.68 | 0.40 | -0.05 | -0.67 | -1.06 |
| 1 | 1 | 80 | 6 | 245 | 0.14 | -1.22 | 0.68 | -0.30 | -0.05 | 0.43 | -0.91 |
| 1 | 1 | 81 | 21 | 120 | 0.16 | -0.46 | -0.08 | -0.82 | -0.62 | -0.67 | 0.70 |
| 1 | 1 | 82 | 11 | 40 | 0.18 | -0.93 | -0.99 | -1.19 | -1.11 | 0.43 | 0.37 |
| 1 | 1 | 83 | 9 | 342 | 0.28 | 1.16 | 0.43 | -0.52 | -0.68 | 2.64 | 1.12 |
| 1 | 1 | 84 | 8 | 286 | 0.05 | -0.72 | 1.11 | 0.50 | -0.05 | 0.43 | -0.83 |
| 1 | 1 | 85 | 8 | 33 | 0.1 | -1.06 | -0.89 | -1.18 | -1.02 | -0.67 | -0.99 |
| 1 | 1 | 86 | 5 | 282 | 0.23 | -1.50 | 0.53 | 0.19 | -0.48 | 0.43 | -2.16 |
| 1 | 1 | 87 | 17 | 137 | 0.07 | -0.31 | 0.68 | -0.78 | -0.76 | -0.67 | 0.96 |
| 1 | 1 | 88 | 19 | 168 | 0.07 | -0.08 | -0.89 | 0.05 | -0.05 | -0.67 | 1.04 |
| 1 | 1 | 89 | 7 | 291 | 0.05 | 1.10 | -0.89 | 0.15 | 1.37 | -0.67 | 0.36 |
| 1 | 1 | 90 | 6 | 24 | 0.19 | -0.97 | -1.02 | -1.07 | -0.94 | 0.43 | -1.40 |
| 1 | 1 | 91 | 4 | 434 | 0.59 | 4.07 | 1.49 | 1.29 | 0.30 | 2.64 | 0.07 |
| 1 | 1 | 92 | 19 | 409 | 0.65 | 0.92 | 0.42 | 1.41 | 1.37 | 2.64 | -0.60 |
| 1 | 1 | 93 | 9 | 393 | 0.46 | 2.04 | 2.30 | 0.91 | -0.05 | -0.31 | -0.34 |
| 1 | 1 | 94 | 8 | 22 | 0.08 | -1.66 | -1.27 | -0.84 | -0.76 | -0.67 | 0.84 |
| 1 | 1 | 95 | 23 | 158 | 0.22 | -1.41 | 0.68 | -0.20 | -0.70 | -0.67 | -1.05 |
| 1 | 1 | 96 | 11 | 330 | 0.22 | 1.55 | 0.68 | -0.45 | -0.31 | 0.43 | -1.59 |
| 1 | 1 | 97 | 6 | 145 | 0.17 | 0.49 | -0.89 | -0.64 | -0.76 | -0.67 | -0.17 |
| 1 | 1 | 98 | 12 | 121 | 0.15 | 0.21 | -0.89 | -0.80 | -0.76 | -0.67 | -1.70 |
| 1 | 1 | 99 | 14 | 329 | 0.24 | 2.06 | 0.63 | 0.31 | -0.25 | 0.43 | 0.99 |
| 1 | 1 | 100 | 16 | 299 | 0.05 | 1.21 | -0.89 | 0.46 | 1.37 | -0.67 | 0.86 |
| 1 | 1 | 101 | 5 | 328 | 0.1 | -0.87 | 1.11 | 1.33 | -0.05 | 0.43 | -1.13 |
| 1 | 1 | 102 | 10 | 278 | 0.1 | 0.33 | -0.93 | 0.45 | 1.37 | -0.67 | 0.02 |
| 1 | 1 | 103 | 5 | 102 | 0.1 | -1.05 | 0.68 | -0.76 | -0.76 | -0.67 | 1.27 |
| 1 | 1 | 104 | 12 | 25 | 0.08 | -1.25 | -1.05 | -0.78 | -0.76 | -0.67 | 1.57 |
| 1 | 1 | 105 | 9 | 385 | 0.25 | 2.05 | 0.34 | -0.21 | -0.05 | 2.64 | 1.12 |
| 1 | 1 | 106 | 10 | 231 | 0.18 | -0.02 | -0.08 | 0.02 | -0.05 | 0.43 | 1.29 |
| 1 | 1 | 107 | 5 | 193 | 0.14 | -0.66 | -0.08 | -0.33 | -0.05 | -0.67 | -1.01 |
| 1 | 1 | 108 | 4 | 418 | 0.06 | -0.03 | 0.68 | 3.07 | 1.37 | 0.43 | -2.25 |
| 1 | 1 | 109 | 4 | 306 | 0.16 | 1.83 | 0.68 | -0.59 | -0.05 | -0.67 | -1.57 |
| 1 | 1 | 110 | 5 | 378 | 0.15 | 1.09 | 0.68 | 1.20 | 1.37 | 0.43 | -0.11 |
| 1 | 1 | 111 | 2 | 308 | 0.07 | -0.55 | 0.68 | 0.26 | 1.37 | -0.67 | -1.25 |
| 1 | 1 | 112 | 18 | 239 | 0.15 | 0.44 | 0.75 | 0.08 | -0.05 | -0.67 | 1.05 |
| 1 | 1 | 113 | 11 | 62 | 0.06 | -1.14 | -0.89 | -0.81 | -0.76 | -0.67 | 0.73 |
| 1 | 1 | 114 | 5 | 169 | 0.07 | -0.26 | -0.89 | 0.47 | -0.05 | -0.67 | 1.40 |
| 1 | 1 | 115 | 18 | 415 | 0.11 | 0.74 | -0.08 | 2.29 | 2.79 | 0.43 | -0.93 |
| 1 | 1 | 116 | 5 | 97 | 0.13 | 0.23 | -1.19 | -1.05 | -0.76 | -0.67 | 0.77 |
| 1 | 1 | 117 | 24 | 19 | 0.22 | -0.06 | 2.30 | -0.89 | -0.89 | -0.67 | 1.18 |
| 1 | 1 | 118 | 9 | 184 | 0.17 | -0.98 | 0.34 | -0.29 | -0.05 | -0.67 | -0.08 |
| 1 | 1 | 119 | 11 | 43 | 0.12 | -0.82 | -0.89 | -1.04 | -0.79 | -0.67 | -1.65 |
| 1 | 1 | 120 | 9 | 257 | 0.29 | 2.17 | -0.62 | 0.10 | -0.05 | -0.55 | 0.65 |
| 1 | 1 | 121 | 27 | 155 | 0.18 | 0.15 | 0.68 | -0.84 | -0.76 | -0.67 | 0.81 |
| 1 | 1 | 122 | 20 | 215 | 0.11 | 0.33 | -0.08 | -0.69 | -0.05 | -0.67 | -1.64 |
| 1 | 1 | 123 | 14 | 348 | 0.24 | 0.61 | 0.74 | 0.37 | 1.37 | 0.43 | 0.25 |
| 1 | 1 | 124 | 13 | 162 | 0.12 | 0.01 | -1.01 | -0.67 | -0.05 | -0.67 | -1.57 |
| 1 | 1 | 125 | 7 | 366 | 0.22 | 1.62 | 0.35 | -0.29 | 1.37 | -0.67 | -1.65 |
| 1 | 1 | 126 | 9 | 77 | 0.02 | -0.88 | -0.89 | -0.78 | -0.76 | -0.67 | 0.66 |
| 1 | 1 | 127 | 25 | 253 | 0.14 | 0.70 | 0.68 | 0.16 | -0.05 | -0.67 | 0.61 |
| 1 | 1 | 128 | 9 | 84 | 0.09 | -0.27 | -1.27 | -0.81 | -0.76 | -0.67 | 0.81 |
| 1 | 1 | 129 | 3 | 398 | 0.04 | -0.03 | -0.08 | 2.29 | 1.37 | 0.43 | -1.98 |
| 1 | 1 | 130 | 9 | 412 | 0.05 | 0.90 | -0.89 | 2.29 | 2.79 | 0.43 | -0.48 |
| 1 | 1 | 131 | 8 | 309 | 0.06 | 1.43 | -0.89 | 0.41 | 1.37 | -0.67 | 0.46 |
| 1 | 1 | 132 | 8 | 267 | 0.08 | 0.10 | 0.68 | 0.37 | -0.05 | 0.43 | 1.57 |
| 1 | 1 | 133 | 16 | 252 | 0.22 | 0.94 | -0.08 | -0.40 | -0.14 | 0.43 | 0.43 |
| 1 | 1 | 134 | 15 | 119 | 0.13 | -0.60 | -0.99 | -0.75 | -0.76 | 0.43 | 0.24 |
| 1 | 1 | 135 | 13 | 48 | 0.09 | -0.66 | -0.92 | -1.19 | -1.11 | -0.67 | 0.82 |
| 1 | 1 | 136 | 9 | 326 | 0.11 | 1.45 | -0.08 | 0.32 | 1.37 | -0.67 | 0.38 |
| 1 | 1 | 137 | 11 | 386 | 0.22 | 1.37 | 0.68 | -0.66 | -0.18 | 2.64 | -1.36 |
| 1 | 1 | 138 | 20 | 255 | 0.15 | 0.43 | -0.89 | -0.34 | -0.05 | 2.64 | 0.68 |
| 1 | 1 | 139 | 8 | 161 | 0.14 | -0.68 | -0.94 | -0.08 | -0.05 | 0.43 | 1.41 |
| 1 | 1 | 140 | 10 | 38 | 0.13 | -1.22 | -1.19 | -0.92 | -0.76 | 0.43 | 0.91 |
| 1 | 1 | 141 | 14 | 180 | 0.12 | 0.09 | 0.68 | -0.56 | -0.76 | -0.67 | 0.07 |
| 1 | 1 | 142 | 13 | 427 | 0.21 | 2.01 | 0.62 | 2.29 | 2.79 | 0.43 | -0.18 |
| 1 | 1 | 143 | 15 | 354 | 0.13 | 1.63 | 0.68 | 0.39 | 1.37 | -0.67 | 0.26 |
| 1 | 1 | 144 | 11 | 301 | 0.14 | 0.82 | -1.03 | 0.15 | 1.37 | -0.67 | -0.85 |
| 1 | 1 | 145 | 7 | 205 | 0.13 | 0.24 | -0.08 | -0.05 | -0.05 | -0.67 | 1.57 |
| 1 | 1 | 146 | 10 | 327 | 0.14 | 0.66 | 0.68 | 0.60 | 1.37 | -0.67 | 0.42 |
| 1 | 1 | 147 | 11 | 403 | 0.11 | 0.92 | 2.30 | 1.22 | 1.37 | 0.43 | -0.79 |
| 1 | 1 | 148 | 7 | 127 | 0.08 | -0.20 | -0.89 | -0.10 | -0.76 | -0.67 | 0.69 |
| 1 | 1 | 149 | 6 | 217 | 0.11 | 0.41 | -0.89 | 0.26 | -0.05 | -0.67 | -0.34 |
| 1 | 1 | 150 | 11 | 323 | 0.1 | 0.96 | -0.89 | 0.46 | 1.37 | 0.43 | 0.72 |
| 1 | 1 | 151 | 9 | 208 | 0.16 | 0.16 | 0.68 | -0.60 | -0.05 | -0.67 | 0.43 |
| 1 | 1 | 152 | 14 | 405 | 0.14 | -0.04 | 0.68 | 2.29 | 1.37 | 0.43 | -2.24 |
| 1 | 1 | 153 | 10 | 98 | 0.08 | -0.22 | -0.89 | -1.14 | -0.76 | -0.67 | 0.33 |
| 1 | 1 | 154 | 7 | 47 | 0.05 | -0.99 | -0.89 | -1.16 | -0.76 | -0.67 | 1.03 |
| 1 | 1 | 155 | 8 | 182 | 0.15 | -1.00 | 0.68 | 0.08 | -0.76 | -0.67 | -0.46 |
| 1 | 1 | 156 | 9 | 153 | 0.07 | -0.24 | -0.93 | 0.04 | -0.05 | -0.67 | 1.57 |
| 1 | 1 | 157 | 23 | 65 | 0.08 | -1.23 | -0.89 | -0.82 | -0.76 | -0.67 | 0.27 |
| 1 | 1 | 158 | 4 | 367 | 0.08 | 3.03 | -0.08 | 0.15 | -0.05 | -0.67 | -1.65 |
| 1 | 1 | 159 | 29 | 421 | 0.15 | 1.04 | 0.68 | 2.29 | 2.79 | 0.43 | -0.84 |
| 1 | 1 | 160 | 10 | 111 | 0.14 | -1.33 | -0.89 | 0.06 | -0.76 | -0.67 | -0.65 |
| 1 | 1 | 161 | 13 | 382 | 0.25 | 3.03 | 0.68 | 0.23 | -0.05 | -0.42 | -1.68 |
| 1 | 1 | 162 | 10 | 142 | 0.1 | -0.54 | -0.89 | -0.76 | -0.05 | -0.67 | -0.73 |
| 1 | 1 | 163 | 14 | 433 | 0.32 | 1.31 | -0.10 | 2.29 | 2.79 | 2.64 | -0.81 |
| 1 | 1 | 164 | 14 | 50 | 0.08 | -0.88 | -0.08 | -1.19 | -1.11 | -0.67 | 0.91 |
| 1 | 1 | 165 | 23 | 125 | 0.14 | -0.78 | 0.68 | -0.84 | -0.76 | -0.67 | 0.41 |
| 1 | 1 | 166 | 7 | 67 | 0.09 | -0.95 | 0.68 | -1.13 | -0.96 | -0.67 | 1.03 |
| 1 | 1 | 167 | 12 | 254 | 0.11 | -0.53 | 0.68 | 0.02 | -0.05 | 0.43 | -0.24 |
| 1 | 1 | 168 | 11 | 113 | 0.17 | -0.63 | -0.08 | -0.98 | -0.92 | -0.67 | -0.30 |
| 1 | 1 | 169 | 9 | 83 | 0.04 | -0.71 | -0.89 | -0.75 | -0.76 | -0.67 | 0.76 |
| 1 | 1 | 170 | 10 | 89 | 0.11 | -1.18 | -0.93 | -0.19 | -0.76 | -0.67 | 0.66 |
| 1 | 1 | 171 | 8 | 44 | 0.06 | -1.41 | -0.89 | -1.13 | -0.76 | -0.67 | 0.36 |
| 1 | 1 | 172 | 8 | 310 | 0.08 | 1.13 | -0.99 | 0.45 | 1.37 | -0.67 | -0.08 |
| 1 | 1 | 173 | 8 | 6 | 0.08 | -1.21 | -0.89 | -1.28 | -1.11 | -0.67 | -1.64 |
| 1 | 1 | 174 | 14 | 295 | 0.12 | -0.78 | -0.08 | 0.38 | -0.76 | 2.64 | -1.00 |
| 1 | 1 | 175 | 12 | 316 | 0.19 | 1.00 | -1.02 | -0.29 | 1.37 | -0.67 | -1.61 |
| 1 | 1 | 176 | 2 | 346 | 0.15 | 0.38 | -0.11 | 2.60 | -0.05 | 0.43 | 0.69 |
| 1 | 1 | 177 | 12 | 194 | 0.17 | -0.92 | 0.68 | -0.46 | -0.76 | 0.43 | -0.16 |
| 1 | 1 | 178 | 17 | 396 | 0.19 | 1.51 | 2.30 | 0.45 | 1.37 | -0.09 | 1.34 |
| 1 | 1 | 179 | 15 | 206 | 0.13 | -1.30 | -0.08 | 0.26 | -0.76 | 0.43 | -0.90 |
| 1 | 1 | 180 | 4 | 322 | 0.13 | 1.35 | -0.89 | -0.06 | 1.37 | 0.43 | 0.52 |
| 1 | 1 | 181 | 31 | 225 | 0.12 | -0.25 | 0.68 | 0.14 | -0.05 | -0.67 | 0.21 |
| 1 | 1 | 182 | 5 | 185 | 0.14 | 0.46 | -0.57 | -0.29 | -0.76 | 0.43 | 1.27 |
| 1 | 1 | 183 | 12 | 103 | 0.15 | -0.68 | -0.08 | -0.63 | -0.76 | -0.67 | 1.39 |
| 1 | 1 | 184 | 2 | 151 | 0.06 | -0.29 | -0.89 | 0.06 | -0.76 | -0.67 | -0.87 |
| 1 | 1 | 185 | 11 | 232 | 0.12 | -0.39 | 0.68 | 0.22 | -0.05 | -0.67 | -0.20 |
| 1 | 1 | 186 | 9 | 313 | 0.29 | 0.41 | -0.89 | 1.32 | 1.37 | -0.67 | 0.31 |
| 1 | 1 | 187 | 17 | 372 | 0.15 | 0.23 | 0.70 | 1.21 | 1.37 | 0.43 | -0.74 |
| 1 | 1 | 188 | 9 | 407 | 0.09 | 0.41 | 2.30 | 1.70 | 1.37 | 0.43 | -1.08 |
| 1 | 1 | 189 | 14 | 68 | 0.36 | -0.83 | 0.19 | -1.16 | -1.09 | 0.43 | 0.95 |
| 1 | 1 | 190 | 9 | 37 | 0.08 | -1.53 | -1.27 | -0.83 | -0.76 | -0.67 | 0.41 |
| 1 | 1 | 191 | 8 | 28 | 0.13 | -1.33 | -1.22 | -1.10 | -0.85 | -0.67 | -0.62 |
| 1 | 1 | 192 | 8 | 71 | 0.24 | -1.65 | -0.89 | -0.19 | -0.67 | -0.67 | 0.43 |
| 1 | 1 | 193 | 9 | 7 | 0.2 | -0.89 | -0.89 | -0.36 | -0.84 | 2.64 | 0.50 |
| 1 | 1 | 194 | 16 | 53 | 0.05 | -1.15 | -0.89 | -0.82 | -0.76 | -0.67 | 1.06 |
| 1 | 1 | 195 | 9 | 29 | 0.07 | -0.60 | -1.27 | -1.08 | -0.76 | -0.67 | -1.66 |
| 1 | 1 | 196 | 14 | 377 | 0.21 | 2.10 | 0.68 | 0.25 | 1.37 | 0.43 | 0.77 |
| 1 | 1 | 197 | 10 | 190 | 0.2 | -0.45 | -0.73 | -0.74 | -0.05 | 0.43 | -0.77 |
| 1 | 1 | 198 | 11 | 15 | 0.13 | -1.32 | -1.03 | -1.19 | -1.11 | 0.43 | 0.88 |
| 1 | 1 | 199 | 13 | 344 | 0.23 | 0.11 | 0.68 | 0.14 | -0.05 | 2.64 | -0.39 |
| 1 | 1 | 200 | 12 | 128 | 0.1 | 0.05 | -0.08 | -0.81 | -0.76 | -0.67 | 0.90 |
| 1 | 1 | 201 | 17 | 36 | 0.06 | -1.25 | -1.27 | -1.13 | -0.76 | -0.67 | 0.40 |
| 1 | 1 | 202 | 12 | 172 | 0.1 | -0.33 | -0.89 | -0.44 | -0.05 | 0.43 | 0.94 |
| 1 | 1 | 203 | 8 | 293 | 0.04 | -0.74 | 1.11 | 0.50 | -0.05 | 0.43 | -1.22 |
| 1 | 1 | 204 | 4 | 165 | 0.06 | 0.02 | -0.89 | -0.67 | -0.76 | 0.43 | -1.69 |
| 1 | 1 | 205 | 13 | 187 | 0.13 | 0.43 | -0.89 | -0.71 | -0.05 | -0.67 | -1.63 |
| 1 | 1 | 206 | 4 | 210 | 0.13 | 0.00 | 2.30 | -0.78 | -0.76 | 0.43 | 0.94 |
| 1 | 1 | 207 | 12 | 70 | 0.05 | -0.72 | -1.27 | -0.81 | -0.76 | -0.67 | 0.69 |
| 1 | 1 | 208 | 13 | 156 | 0.14 | 0.14 | -0.08 | -0.21 | -0.76 | -0.67 | 0.93 |
| 1 | 1 | 209 | 10 | 317 | 0.17 | 0.87 | 2.30 | 0.21 | -0.05 | -0.67 | 1.31 |
| 1 | 1 | 210 | 19 | 197 | 0.23 | -0.12 | 0.68 | -0.79 | -0.83 | 0.43 | 0.37 |
| 1 | 1 | 211 | 13 | 201 | 0.15 | 0.33 | -0.98 | -0.29 | -0.05 | -0.67 | -0.85 |
| 1 | 1 | 212 | 12 | 94 | 0.15 | -1.25 | -0.08 | -0.80 | -0.76 | -0.67 | 0.38 |
| 1 | 1 | 213 | 26 | 85 | 0.09 | -0.35 | -0.89 | -0.98 | -0.76 | -0.67 | -1.63 |
| 1 | 1 | 214 | 13 | 324 | 0.11 | 0.51 | 2.30 | 0.21 | -0.05 | 0.43 | 0.66 |
| 1 | 1 | 215 | 9 | 95 | 0.08 | -0.54 | 0.68 | -1.18 | -1.07 | -0.67 | 1.01 |
| 1 | 1 | 216 | 23 | 126 | 0.14 | -0.21 | -0.08 | -1.01 | -0.77 | -0.67 | -1.64 |
| 1 | 1 | 217 | 10 | 256 | 0.2 | 0.88 | 0.68 | -0.39 | -0.05 | -0.67 | -0.79 |
| 1 | 1 | 218 | 19 | 356 | 0.34 | -0.96 | 0.52 | 1.97 | -0.20 | -0.67 | -2.18 |
| 1 | 1 | 219 | 14 | 147 | 0.11 | 0.17 | -0.08 | -0.77 | -0.76 | -0.67 | 0.27 |
| 1 | 1 | 220 | 14 | 240 | 0.2 | 0.44 | 0.86 | 0.03 | -0.10 | -0.67 | 1.57 |
| 1 | 1 | 221 | 13 | 191 | 0.18 | 0.14 | -0.08 | -0.75 | -0.49 | 0.43 | 0.67 |
| 1 | 1 | 222 | 7 | 380 | 0.05 | 0.04 | -0.08 | 1.70 | 1.37 | 0.43 | -1.23 |
| 1 | 1 | 223 | 18 | 88 | 0.05 | -0.46 | -0.89 | -0.77 | -0.76 | -0.67 | 1.00 |
| 1 | 1 | 224 | 12 | 417 | 0.1 | 1.69 | 0.68 | 2.29 | 2.79 | -0.67 | 0.38 |
| 1 | 1 | 225 | 18 | 413 | 0.27 | 2.56 | 0.68 | -0.04 | 1.37 | 2.64 | 0.59 |
| 1 | 1 | 226 | 9 | 401 | 0.08 | 0.81 | -0.89 | 1.20 | 1.37 | 2.64 | -0.61 |
| 1 | 1 | 227 | 8 | 49 | 0.08 | -1.28 | -0.89 | -0.29 | -0.76 | -0.67 | 1.41 |
| 1 | 1 | 228 | 8 | 12 | 0.18 | -1.32 | -1.03 | -0.93 | -0.85 | 0.43 | 1.57 |
| 1 | 1 | 229 | 6 | 237 | 0.19 | 0.49 | 0.17 | -0.73 | -0.76 | 0.43 | -1.52 |
| 1 | 1 | 230 | 16 | 266 | 0.12 | -0.03 | 0.68 | 0.49 | -0.05 | 0.43 | 0.54 |
| 1 | 1 | 231 | 5 | 82 | 0.12 | -1.25 | -1.12 | -0.29 | -0.05 | -0.67 | 1.27 |
| 1 | 1 | 232 | 24 | 110 | 0.12 | -0.42 | -1.00 | -0.71 | -0.76 | -0.67 | -0.07 |
| 1 | 1 | 233 | 14 | 270 | 0.15 | 0.34 | -1.11 | 0.45 | 1.37 | -0.67 | 1.39 |
| 1 | 1 | 234 | 4 | 200 | 0.07 | -0.30 | -0.89 | 0.26 | -0.05 | 0.43 | 1.57 |
| 1 | 1 | 235 | 16 | 263 | 0.29 | 0.70 | 0.35 | -0.52 | -0.18 | 0.43 | -0.62 |
| 1 | 1 | 236 | 12 | 280 | 0.21 | 1.35 | 0.68 | -0.41 | -0.23 | 0.43 | 0.90 |
| 1 | 1 | 237 | 8 | 281 | 0.07 | -0.96 | 0.68 | 0.48 | -0.05 | 0.43 | -1.23 |
| 1 | 1 | 238 | 7 | 287 | 0.13 | 0.20 | 0.68 | 0.45 | -0.05 | 0.43 | -0.55 |
| 1 | 1 | 239 | 16 | 116 | 0.18 | -1.23 | -0.08 | -0.57 | -0.76 | -0.67 | -0.37 |
| 1 | 1 | 240 | 31 | 321 | 0.42 | -0.50 | 2.30 | 0.40 | -0.12 | -0.03 | -1.00 |
| 1 | 1 | 241 | 16 | 420 | 0.13 | 1.51 | -0.08 | 2.29 | 2.79 | 0.43 | -0.49 |
| 1 | 1 | 242 | 8 | 199 | 0.12 | 0.05 | -1.03 | 0.20 | -0.05 | -0.67 | -0.07 |
| 1 | 1 | 243 | 8 | 314 | 0.07 | 0.99 | -0.08 | 0.45 | 1.37 | -0.67 | 0.72 |
| 1 | 1 | 244 | 7 | 289 | 0.06 | -0.20 | 0.68 | 1.23 | -0.05 | 0.43 | 0.35 |
| 1 | 1 | 245 | 15 | 216 | 0.16 | -0.16 | 0.68 | 0.03 | -0.05 | -0.67 | 0.76 |
| 1 | 1 | 246 | 14 | 273 | 0.16 | 0.31 | 0.63 | 0.14 | -0.05 | 0.43 | 0.07 |
| 1 | 1 | 247 | 20 | 132 | 0.18 | -0.76 | -0.08 | -0.14 | -0.72 | -0.67 | 0.74 |
| 1 | 1 | 248 | 5 | 369 | 0.08 | -0.74 | 0.68 | 1.70 | -0.05 | 0.43 | -2.35 |
| 1 | 1 | 249 | 11 | 432 | 0.33 | 0.40 | 1.35 | 2.13 | 1.37 | 2.64 | -2.28 |
| 1 | 1 | 250 | 6 | 341 | 0.22 | 0.46 | 0.62 | 0.21 | 1.37 | 0.43 | 1.36 |
| 1 | 1 | 251 | 24 | 404 | 0.12 | 1.30 | -0.89 | 2.29 | 2.79 | -0.67 | 0.33 |
| 1 | 1 | 252 | 9 | 63 | 0.05 | -1.23 | -0.89 | -0.75 | -0.76 | -0.67 | 0.65 |
| 1 | 1 | 253 | 11 | 226 | 0.16 | 0.32 | -0.96 | -0.44 | -0.05 | 0.43 | -0.89 |
| 1 | 1 | 254 | 11 | 133 | 0.09 | -0.43 | -0.89 | -0.75 | -0.05 | -0.67 | 0.33 |
| 1 | 1 | 255 | 19 | 376 | 0.16 | 0.60 | 0.68 | 1.17 | 1.37 | 0.43 | -0.70 |
| 1 | 1 | 256 | 8 | 80 | 0.06 | -0.23 | -0.89 | -1.15 | -0.76 | -0.67 | 0.91 |
| 1 | 1 | 257 | 13 | 164 | 0.27 | -0.58 | 0.15 | -0.89 | -0.87 | 0.43 | -0.58 |
| 1 | 1 | 258 | 7 | 78 | 0.07 | -0.82 | -0.89 | -0.29 | -0.76 | -0.67 | 1.33 |
| 1 | 1 | 259 | 5 | 422 | 0.05 | 1.83 | -0.89 | 2.29 | 2.79 | 0.43 | -0.23 |
| 1 | 1 | 260 | 10 | 23 | 0.11 | -1.45 | -1.23 | -1.16 | -0.83 | -0.67 | -0.10 |
| 1 | 1 | 261 | 8 | 60 | 0.05 | -0.57 | -1.27 | -1.14 | -0.76 | -0.67 | 0.44 |
| 1 | 1 | 262 | 12 | 93 | 0.12 | -0.88 | -0.92 | -0.78 | -0.76 | -0.67 | -0.41 |
| 1 | 1 | 263 | 20 | 279 | 0.24 | 0.33 | 2.30 | 0.14 | -0.19 | -0.67 | 0.31 |
| 1 | 1 | 264 | 15 | 75 | 0.05 | -0.67 | -0.89 | -1.10 | -0.76 | -0.67 | 0.39 |
| 1 | 1 | 265 | 15 | 351 | 0.04 | 0.27 | -0.89 | 1.20 | 1.37 | 0.43 | -0.45 |
| 1 | 1 | 266 | 7 | 408 | 0.25 | 2.71 | -0.89 | -0.27 | 1.37 | 2.64 | 0.06 |
| 1 | 1 | 267 | 10 | 57 | 0.07 | -0.55 | -1.27 | -1.15 | -0.76 | -0.67 | 0.82 |
| 1 | 1 | 268 | 13 | 387 | 0.24 | 0.11 | 0.68 | 1.48 | -0.05 | 2.64 | -0.61 |
| 1 | 1 | 269 | 6 | 436 | 0.27 | 1.23 | 2.30 | 1.91 | 2.08 | 2.64 | -1.04 |
| 1 | 1 | 270 | 11 | 202 | 0.09 | 0.20 | -0.89 | 0.47 | -0.05 | -0.67 | 0.68 |
| 1 | 1 | 271 | 12 | 402 | 0.47 | 1.54 | 0.81 | 2.42 | 1.37 | 0.34 | 0.54 |
| 1 | 1 | 272 | 12 | 152 | 0.15 | -0.05 | -0.08 | -0.83 | -0.76 | -0.67 | -0.87 |
| 1 | 1 | 273 | 9 | 21 | 0.09 | -1.48 | -0.89 | -1.15 | -0.96 | -0.67 | 0.95 |
| 1 | 1 | 274 | 11 | 69 | 0.05 | -0.74 | -0.89 | -0.84 | -0.76 | -0.67 | 1.11 |
| 1 | 1 | 275 | 14 | 249 | 0.24 | -0.06 | -0.77 | -0.01 | -0.15 | 2.64 | 0.62 |
| 1 | 1 | 276 | 4 | 318 | 0.08 | -1.02 | 0.68 | 1.35 | -0.05 | 0.43 | -1.21 |
| 1 | 1 | 277 | 19 | 247 | 0.15 | 0.09 | 0.68 | -0.36 | -0.05 | 0.43 | 0.65 |
| 1 | 1 | 278 | 14 | 297 | 0.17 | 1.23 | 0.68 | 0.23 | -0.05 | 0.43 | 0.58 |
| 1 | 1 | 279 | 8 | 338 | 0.08 | 0.88 | 0.68 | 0.45 | 1.37 | -0.67 | 1.41 |
| 1 | 1 | 280 | 13 | 171 | 0.1 | -0.38 | -0.92 | 0.00 | -0.05 | -0.67 | 0.06 |
| 1 | 1 | 281 | 14 | 371 | 0.28 | -0.93 | 2.30 | 0.46 | -0.20 | -0.12 | -2.31 |
| 1 | 1 | 282 | 24 | 173 | 0.33 | -0.28 | 0.73 | -0.75 | -0.82 | 0.43 | 0.97 |
| 1 | 1 | 283 | 18 | 355 | 0.14 | 1.38 | 0.68 | -0.21 | 1.37 | 0.43 | 0.42 |
| 1 | 1 | 284 | 4 | 345 | 0.11 | 0.83 | -0.49 | -0.66 | 1.37 | 0.43 | -1.59 |
| 1 | 1 | 285 | 11 | 56 | 0.05 | -0.83 | -1.27 | -0.78 | -0.76 | -0.67 | 1.01 |
| 1 | 1 | 286 | 7 | 370 | 0.19 | 1.55 | 0.68 | -0.38 | 1.37 | 0.43 | -1.02 |
| 1 | 1 | 287 | 10 | 64 | 0.06 | -0.48 | -0.89 | -1.19 | -1.11 | -0.67 | 0.41 |
| 1 | 1 | 288 | 9 | 167 | 0.15 | -0.36 | -0.89 | -0.71 | -0.05 | 0.43 | 0.29 |
| 1 | 1 | 289 | 6 | 124 | 0.15 | -1.28 | -1.02 | -0.29 | -0.05 | -0.67 | -0.30 |
| 1 | 1 | 290 | 14 | 272 | 0.11 | -0.70 | 0.68 | 0.43 | -0.05 | 0.43 | -0.78 |
| 1 | 1 | 291 | 25 | 188 | 0.09 | 0.21 | -0.89 | 0.05 | -0.05 | -0.67 | 0.52 |
| 1 | 1 | 292 | 23 | 435 | 0.3 | 1.05 | 2.30 | 2.59 | 2.79 | 0.43 | -1.56 |
| 1 | 1 | 293 | 8 | 8 | 0.08 | -1.66 | -1.27 | -1.18 | -0.94 | -0.67 | 1.07 |
| 1 | 1 | 294 | 14 | 431 | 0.2 | 2.14 | 2.30 | 2.29 | 2.79 | -0.60 | 0.40 |
| 1 | 1 | 295 | 3 | 96 | 0.08 | -0.16 | -0.89 | -0.58 | -0.76 | -0.67 | 1.57 |
| 1 | 1 | 296 | 9 | 87 | 0.06 | -0.46 | -1.27 | -0.78 | -0.76 | -0.67 | 0.39 |
| 1 | 1 | 297 | 6 | 414 | 0.04 | 0.90 | -0.89 | 2.29 | 2.79 | 0.43 | -0.85 |
| 1 | 1 | 298 | 10 | 230 | 0.18 | 0.31 | -0.89 | 0.32 | -0.12 | 0.43 | 0.71 |
| 1 | 1 | 299 | 13 | 58 | 0.11 | -0.84 | -0.95 | -1.16 | -0.98 | -0.67 | -0.08 |
| 1 | 1 | 300 | 11 | 4 | 0.31 | 0.63 | -1.03 | -0.85 | -0.76 | 2.64 | 0.96 |
| 1 | 1 | 301 | 24 | 277 | 0.18 | 0.59 | 0.68 | 0.21 | -0.08 | 0.43 | 0.76 |
| 1 | 1 | 302 | 13 | 130 | 0.1 | -0.85 | -0.08 | -0.61 | -0.76 | -0.67 | -0.80 |
| 1 | 1 | 303 | 6 | 131 | 0.17 | -0.53 | 0.68 | -0.61 | -0.52 | -0.67 | 1.57 |
| 1 | 1 | 304 | 10 | 55 | 0.09 | -1.09 | -1.27 | -0.81 | -0.76 | -0.67 | 0.53 |
| 1 | 1 | 305 | 21 | 269 | 0.11 | -1.16 | 0.68 | 0.48 | -0.05 | -0.67 | -2.25 |
| 1 | 1 | 306 | 21 | 91 | 0.1 | -0.45 | -0.08 | -1.19 | -1.11 | -0.67 | 0.57 |
| 1 | 1 | 307 | 9 | 104 | 0.08 | -0.46 | -0.98 | -0.87 | -0.76 | -0.67 | -1.12 |
| 1 | 1 | 308 | 19 | 32 | 0.07 | -1.02 | -0.89 | -1.19 | -1.11 | -0.67 | 0.97 |
| 1 | 1 | 309 | 4 | 268 | 0.13 | 0.22 | -0.89 | 0.07 | -0.05 | 2.64 | 1.46 |
| 1 | 1 | 310 | 4 | 59 | 0.03 | -0.93 | -0.89 | -1.14 | -0.76 | -0.67 | 0.69 |
| 1 | 1 | 311 | 4 | 2 | 0.12 | -0.52 | -0.89 | -0.66 | -0.76 | 2.64 | 1.22 |
| 1 | 1 | 312 | 8 | 3 | 0.15 | -1.66 | -1.22 | -1.08 | -0.89 | -0.67 | 1.57 |
| 1 | 1 | 313 | 8 | 141 | 0.14 | -0.61 | -0.89 | -0.21 | -0.76 | 0.43 | 0.62 |
| 1 | 1 | 314 | 14 | 183 | 0.08 | -0.35 | -0.89 | 0.45 | -0.05 | -0.67 | 0.46 |
| 1 | 1 | 315 | 6 | 192 | 0.15 | 0.62 | -0.89 | -0.58 | -0.76 | 0.43 | -0.26 |
| 1 | 1 | 316 | 6 | 196 | 0.17 | -0.81 | -0.76 | -0.07 | -0.05 | 0.43 | 0.02 |
| 1 | 1 | 317 | 15 | 92 | 0.04 | -0.75 | -0.89 | -0.75 | -0.76 | -0.67 | 0.36 |
| 1 | 1 | 318 | 6 | 214 | 0.21 | 1.06 | 0.17 | -0.90 | -0.64 | -0.67 | -1.66 |
| 1 | 1 | 319 | 9 | 389 | 0.05 | 0.14 | 0.68 | 1.70 | 1.37 | 0.43 | -1.23 |
| 1 | 1 | 320 | 8 | 381 | 0.26 | -0.31 | 2.30 | 1.86 | -0.05 | -0.12 | -0.99 |
| 1 | 1 | 321 | 2 | 438 | 0.38 | 2.15 | 1.49 | 6.57 | 1.37 | 0.43 | -0.28 |
| 1 | 1 | 322 | 18 | 224 | 0.09 | -0.62 | 0.68 | 0.03 | -0.05 | -0.67 | -0.65 |
| 1 | 1 | 323 | 7 | 361 | 0.05 | 0.63 | -0.89 | 1.20 | 1.37 | 0.43 | -0.72 |
| 1 | 1 | 324 | 8 | 352 | 0.12 | 0.50 | -0.89 | 1.20 | 1.37 | 0.43 | -0.08 |
| 1 | 1 | 325 | 8 | 234 | 0.1 | -0.43 | 0.68 | -0.12 | -0.05 | 0.43 | 1.46 |
| 1 | 1 | 326 | 4 | 17 | 0.04 | -1.04 | -0.08 | -1.19 | -1.11 | -0.67 | 1.57 |
| 1 | 1 | 327 | 3 | 5 | 0.04 | -2.07 | -0.89 | -1.18 | -1.11 | -0.67 | 0.90 |
| 1 | 1 | 328 | 9 | 304 | 0.1 | -0.22 | 0.68 | 1.23 | -0.05 | 0.43 | -0.46 |
| 1 | 1 | 329 | 9 | 350 | 0.09 | -1.17 | 2.30 | 0.45 | -0.76 | -0.67 | -2.26 |
| 1 | 1 | 330 | 12 | 294 | 0.1 | 0.98 | -0.92 | 0.45 | 1.37 | -0.67 | 0.42 |
| 1 | 1 | 331 | 9 | 307 | 0.16 | 0.42 | -1.14 | 0.45 | 1.37 | 0.43 | 1.02 |
| 1 | 1 | 332 | 9 | 439 | 0.49 | 1.48 | 1.28 | 4.62 | 3.89 | 0.92 | -2.46 |
| 1 | 1 | 333 | 10 | 148 | 0.16 | -1.03 | -0.57 | -0.22 | -0.76 | 0.43 | -0.12 |
| 1 | 1 | 334 | 6 | 424 | 0.25 | 1.52 | 2.30 | 0.32 | 1.37 | 2.64 | 0.77 |
| 1 | 1 | 335 | 19 | 52 | 0.13 | -1.61 | -0.91 | -0.75 | -0.78 | -0.67 | -0.31 |
| 1 | 1 | 336 | 12 | 31 | 0.05 | -1.28 | -0.89 | -1.19 | -1.11 | -0.67 | 0.60 |
| 1 | 1 | 337 | 19 | 123 | 0.2 | -1.52 | -0.08 | -0.20 | -0.76 | -0.67 | -0.94 |
| 1 | 1 | 338 | 11 | 218 | 0.17 | -1.37 | 0.54 | 0.45 | -0.76 | -0.67 | -2.18 |
| 1 | 1 | 339 | 18 | 262 | 0.2 | 0.88 | 0.68 | -0.51 | -0.17 | 0.43 | 0.51 |
| 1 | 1 | 340 | 3 | 325 | 0.07 | -1.02 | 0.68 | 0.58 | -0.05 | 0.43 | -2.46 |
| 1 | 1 | 341 | 7 | 212 | 0.1 | -1.39 | 0.68 | -0.26 | -0.76 | 0.43 | -1.09 |
| 1 | 1 | 342 | 12 | 41 | 0.1 | -1.75 | -0.92 | -0.62 | -0.76 | -0.67 | -0.88 |
| 1 | 1 | 343 | 7 | 290 | 0.08 | 0.87 | -1.27 | 0.46 | 1.37 | -0.67 | 0.86 |
| 1 | 1 | 344 | 8 | 139 | 0.12 | -1.50 | -0.89 | 0.01 | -0.76 | 0.43 | -0.63 |
| 1 | 1 | 345 | 14 | 360 | 0.27 | 2.68 | 0.68 | 0.44 | -0.25 | 0.43 | 0.22 |
| 1 | 1 | 346 | 6 | 181 | 0.05 | -0.41 | -0.89 | -0.29 | -0.05 | 0.43 | 0.55 |
| 1 | 1 | 347 | 13 | 353 | 0.11 | 1.54 | 0.68 | 0.47 | 1.37 | -0.67 | 0.87 |
| 1 | 1 | 348 | 8 | 349 | 0.13 | 1.34 | 0.30 | 0.15 | 1.37 | -0.67 | -0.87 |
| 1 | 1 | 349 | 6 | 61 | 0.04 | -0.65 | -0.89 | -1.14 | -0.76 | -0.67 | 0.94 |
| 1 | 1 | 350 | 21 | 336 | 0.25 | 0.45 | 0.68 | 0.04 | -0.12 | 2.64 | 0.86 |
| 1 | 1 | 351 | 12 | 235 | 0.26 | -0.42 | 0.36 | -0.39 | -0.76 | 2.64 | 0.12 |
| 1 | 1 | 352 | 10 | 115 | 0.16 | -1.06 | -1.08 | -0.38 | -0.05 | -0.67 | 0.60 |
| 1 | 1 | 353 | 5 | 339 | 0.15 | -0.28 | 0.22 | 2.29 | -0.05 | 0.43 | -0.55 |
| 1 | 1 | 354 | 16 | 397 | 0.34 | -0.72 | 2.30 | 1.86 | -0.14 | -0.26 | -2.30 |
| 1 | 1 | 355 | 4 | 236 | 0.15 | -0.45 | 0.68 | -0.08 | -0.05 | -0.67 | -1.43 |
| 1 | 1 | 356 | 11 | 160 | 0.1 | -0.31 | -1.27 | 0.12 | -0.05 | -0.67 | 0.58 |
| 1 | 1 | 357 | 15 | 10 | 0.12 | -1.28 | -1.27 | -1.05 | -0.83 | -0.67 | -1.64 |
| 1 | 1 | 358 | 4 | 392 | 0.2 | 0.97 | 2.30 | 0.54 | 1.37 | 0.43 | 0.28 |
| 1 | 1 | 359 | 10 | 144 | 0.11 | 0.46 | -0.89 | -0.29 | -0.76 | -0.67 | 0.75 |
| 1 | 1 | 360 | 7 | 213 | 0.07 | -0.32 | -0.89 | 0.48 | -0.05 | 0.43 | 0.59 |
| 1 | 1 | 361 | 9 | 394 | 0.28 | 2.75 | 0.68 | -0.09 | -0.29 | 2.64 | 0.44 |
| 1 | 1 | 362 | 7 | 340 | 0.2 | 0.97 | -1.00 | 0.03 | 1.37 | 0.43 | -0.83 |
| 1 | 1 | 363 | 21 | 251 | 0.24 | 1.15 | 0.68 | -0.21 | -0.12 | -0.67 | 0.79 |
| 1 | 1 | 364 | 11 | 118 | 0.1 | 0.14 | -0.89 | -0.84 | -0.76 | -0.67 | 0.42 |
| 1 | 1 | 365 | 12 | 99 | 0.04 | -0.35 | -0.89 | -0.79 | -0.76 | -0.67 | 0.76 |
| 1 | 1 | 366 | 8 | 388 | 0.15 | 1.15 | 2.30 | 0.73 | 1.37 | -0.67 | 0.52 |
| 1 | 1 | 367 | 8 | 260 | 0.14 | 0.41 | 0.68 | 0.40 | -0.05 | -0.67 | -0.36 |
| 1 | 1 | 368 | 12 | 117 | 0.09 | -0.93 | -0.89 | -0.04 | -0.76 | -0.67 | 0.01 |
| 1 | 1 | 369 | 12 | 303 | 0.05 | -0.52 | 0.68 | 1.23 | -0.05 | 0.43 | -0.83 |
| 1 | 1 | 370 | 7 | 395 | 0.21 | 1.06 | 2.30 | 0.16 | -0.05 | 2.64 | 0.75 |
| 1 | 1 | 371 | 17 | 81 | 0.13 | -1.49 | -0.91 | -0.30 | -0.76 | -0.67 | -0.20 |
| 1 | 1 | 372 | 10 | 209 | 0.28 | 0.96 | 0.60 | -0.64 | -0.76 | -0.67 | -0.04 |
| 1 | 1 | 373 | 17 | 66 | 0.08 | -1.33 | -0.89 | -0.81 | -0.76 | -0.67 | -0.51 |
| 1 | 1 | 374 | 14 | 136 | 0.12 | -0.28 | -0.92 | -0.50 | -0.05 | -0.67 | 1.05 |
| 1 | 1 | 375 | 20 | 198 | 0.09 | -0.42 | -0.89 | 0.45 | -0.05 | -0.67 | -0.41 |
| 1 | 1 | 376 | 16 | 26 | 0.08 | -1.40 | -1.27 | -1.15 | -0.76 | -0.67 | 0.81 |
| 1 | 1 | 377 | 7 | 343 | 0.22 | 0.87 | -0.89 | -0.65 | -0.35 | 2.64 | -1.45 |
| 1 | 1 | 378 | 10 | 39 | 0.06 | -1.04 | -1.27 | -1.11 | -0.76 | -0.67 | 0.91 |
| 1 | 1 | 379 | 8 | 229 | 0.09 | 0.17 | 0.68 | -0.63 | -0.05 | -0.67 | -0.85 |
| 1 | 1 | 380 | 15 | 175 | 0.14 | 0.41 | -0.89 | -0.46 | -0.05 | -0.67 | 0.41 |
| 1 | 1 | 381 | 13 | 35 | 0.07 | -1.28 | -1.27 | -0.82 | -0.76 | -0.67 | 1.03 |
| 1 | 1 | 382 | 14 | 220 | 0.09 | -0.81 | -0.08 | 0.45 | -0.05 | -0.67 | -0.96 |
| 1 | 1 | 383 | 11 | 283 | 0.12 | 0.38 | 2.30 | -0.05 | -0.05 | -0.67 | 1.46 |
| 1 | 1 | 384 | 10 | 134 | 0.12 | -0.09 | 1.02 | -0.78 | -0.76 | -0.67 | 1.36 |
| 1 | 1 | 385 | 14 | 416 | 0.08 | 1.33 | -0.89 | 2.29 | 2.79 | 0.43 | -0.30 |
| 1 | 1 | 386 | 15 | 76 | 0.18 | -1.35 | -0.94 | -0.63 | -0.81 | 0.43 | 0.43 |
| 1 | 1 | 387 | 23 | 157 | 0.08 | -0.37 | -0.89 | -0.04 | -0.05 | -0.67 | 0.67 |
| 1 | 1 | 388 | 4 | 391 | 0.26 | 1.95 | 0.49 | 0.42 | -0.05 | 2.64 | 0.27 |
| 1 | 1 | 389 | 6 | 105 | 0.08 | -0.68 | 0.68 | -1.25 | -1.11 | -0.67 | -1.44 |
| 1 | 1 | 390 | 8 | 315 | 0.16 | 0.60 | -1.08 | 0.37 | 1.37 | 0.43 | 0.25 |
| 1 | 1 | 391 | 14 | 72 | 0.04 | -0.78 | -0.89 | -0.76 | -0.76 | -0.67 | 0.96 |
| 1 | 1 | 392 | 8 | 259 | 0.15 | 0.35 | 0.79 | -0.12 | -0.05 | 0.43 | 1.28 |
| 1 | 1 | 393 | 18 | 222 | 0.1 | -0.38 | -0.08 | 0.45 | -0.05 | -0.67 | -0.53 |
| 1 | 1 | 394 | 11 | 333 | 0.06 | 1.49 | -0.08 | 0.46 | 1.37 | -0.67 | 0.89 |
| 1 | 1 | 395 | 15 | 276 | 0.1 | -0.28 | 0.68 | 0.32 | -0.05 | 0.43 | -0.74 |
| 1 | 1 | 396 | 9 | 16 | 0.11 | -1.53 | -1.27 | -1.19 | -1.03 | -0.67 | 0.53 |
| 1 | 1 | 397 | 9 | 364 | 0.08 | 0.27 | -0.08 | 1.20 | 1.37 | 0.43 | -0.75 |
| 1 | 1 | 398 | 10 | 211 | 0.14 | 0.00 | -1.00 | -0.05 | -0.05 | 0.43 | 0.18 |
| 1 | 1 | 399 | 7 | 384 | 0.25 | 0.30 | 0.74 | 1.76 | 1.37 | 0.43 | -0.80 |
| 1 | 1 | 400 | 6 | 177 | 0.14 | 1.15 | -0.08 | -0.61 | -0.76 | -0.67 | 0.67 |
| 1 | 1 | 401 | 19 | 203 | 0.11 | -0.13 | -1.01 | 0.05 | -0.05 | 0.43 | 0.76 |
| 1 | 1 | 402 | 8 | 312 | 0.09 | 0.73 | -0.08 | 0.45 | 1.37 | -0.67 | 1.41 |
| 1 | 1 | 403 | 9 | 258 | 0.2 | -0.18 | -0.89 | 2.29 | -0.05 | -0.67 | -0.38 |
| 1 | 1 | 404 | 11 | 244 | 0.16 | -1.09 | 0.68 | 0.39 | -0.76 | 0.43 | -0.94 |
| 1 | 1 | 405 | 11 | 319 | 0.21 | 0.03 | 0.54 | 2.29 | -0.05 | -0.67 | -0.36 |
| 1 | 1 | 406 | 15 | 159 | 0.08 | -0.10 | -1.27 | 0.08 | -0.05 | -0.67 | 0.91 |
| 1 | 1 | 407 | 33 | 221 | 0.14 | 0.64 | -0.08 | -0.05 | -0.05 | -0.67 | 0.68 |
| 1 | 1 | 408 | 11 | 112 | 0.09 | -0.50 | 0.68 | -1.18 | -1.08 | -0.67 | 0.48 |
| 1 | 1 | 409 | 17 | 399 | 0.31 | 1.42 | 0.73 | 0.26 | 1.37 | 2.64 | 0.75 |
| 1 | 1 | 410 | 5 | 20 | 0.08 | -1.28 | -1.27 | -1.05 | -0.90 | -0.67 | -1.12 |
| 1 | 1 | 411 | 6 | 34 | 0.08 | -0.85 | -0.08 | -1.25 | -1.11 | -0.67 | -1.44 |
| 1 | 1 | 412 | 6 | 426 | 0.2 | 3.36 | 0.68 | -0.27 | 1.37 | 2.64 | -0.22 |
| 1 | 1 | 413 | 12 | 285 | 0.1 | 0.72 | -1.24 | 0.30 | 1.37 | -0.67 | 0.32 |
| 1 | 1 | 414 | 8 | 363 | 0.08 | 1.28 | 1.11 | 0.45 | 1.37 | -0.67 | 1.36 |
| 1 | 1 | 415 | 8 | 437 | 0.27 | 1.02 | 1.00 | 3.07 | 2.79 | 2.64 | -2.30 |
| 1 | 1 | 416 | 7 | 284 | 0.19 | 0.20 | -0.66 | -0.74 | -0.05 | 2.64 | -0.71 |
| 1 | 1 | 417 | 9 | 264 | 0.06 | 0.29 | -0.89 | 0.45 | 1.37 | -0.67 | 0.66 |
| 1 | 1 | 418 | 21 | 238 | 0.21 | 0.24 | -0.08 | -0.09 | -0.05 | 0.43 | 0.69 |
| 1 | 1 | 419 | 3 | 296 | 0.02 | -0.47 | 0.68 | 0.03 | -0.05 | 0.43 | -2.34 |
| 1 | 1 | 420 | 10 | 100 | 0.16 | -0.44 | -0.97 | -0.87 | -0.76 | 0.43 | 1.21 |
| 1 | 1 | 421 | 11 | 174 | 0.15 | -0.61 | 0.68 | -0.02 | -0.76 | -0.67 | 0.17 |
| 1 | 1 | 422 | 9 | 311 | 0.25 | 1.52 | -0.53 | -0.55 | -0.76 | 2.64 | 0.36 |
| 1 | 1 | 423 | 6 | 135 | 0.09 | -0.58 | -1.27 | 0.12 | -0.05 | -0.67 | 1.29 |
| 1 | 1 | 424 | 5 | 368 | 0.15 | 2.37 | 0.68 | 0.36 | 1.37 | -0.67 | 0.68 |
| 1 | 1 | 425 | 9 | 373 | 0.12 | 1.65 | 0.68 | 0.39 | 1.37 | 0.43 | -0.01 |
| 1 | 1 | 426 | 10 | 228 | 0.13 | 0.42 | -0.08 | -0.33 | -0.05 | -0.67 | -0.92 |
| 1 | 1 | 427 | 20 | 86 | 0.22 | -1.23 | -0.93 | -0.85 | -0.83 | 0.43 | -0.46 |
| 1 | 1 | 428 | 7 | 106 | 0.06 | -0.07 | -0.89 | -0.76 | -0.76 | -0.67 | 0.98 |
| 1 | 1 | 429 | 8 | 138 | 0.12 | -0.45 | -0.99 | -0.78 | -0.76 | 0.43 | -0.77 |
| 1 | 1 | 430 | 8 | 30 | 0.12 | -0.51 | 1.05 | -1.19 | -1.11 | -0.67 | 1.36 |
| 1 | 1 | 431 | 19 | 357 | 0.16 | 1.03 | 0.68 | -0.27 | -0.05 | 2.64 | 0.61 |
| 1 | 1 | 432 | 20 | 406 | 0.38 | 0.02 | 2.30 | 0.99 | -0.05 | 2.64 | -1.12 |
| 1 | 1 | 433 | 10 | 302 | 0.26 | 1.83 | 0.53 | -0.09 | -0.33 | 0.43 | 0.24 |
| 1 | 1 | 434 | 5 | 73 | 0.04 | -0.72 | -0.89 | -0.95 | -0.76 | -0.67 | 0.82 |
| 1 | 1 | 435 | 1 | 440 | 0 | 5.51 | 0.68 | 3.07 | 4.20 | 2.64 | 0.50 |
| 1 | 1 | 436 | 14 | 300 | 0.29 | 2.04 | 0.68 | 0.38 | -0.15 | -0.67 | 0.69 |
| 1 | 1 | 437 | 19 | 74 | 0.1 | -1.10 | -0.89 | -0.81 | -0.76 | -0.67 | -0.90 |
| 1 | 1 | 438 | 7 | 347 | 0.13 | 1.21 | -0.08 | 0.41 | 1.37 | 0.43 | 0.59 |
| 1 | 1 | 439 | 31 | 374 | 0.28 | -0.27 | 0.87 | 0.49 | -0.05 | 2.64 | -1.24 |
| 1 | 1 | 440 | 17 | 233 | 0.2 | -0.48 | 0.68 | -0.10 | -0.13 | 0.43 | 0.60 |
Now let us understand what each column in the above summary table means:
Segment.Level - Level (layer) of the hierarchy that the cell belongs to. In this case, we have performed vector quantization for depth 1, hence Segment.Level is 1 for every cell.
Segment.Parent - Parent segment of the cell.
Segment.Child (Cell.Number) - The children of a particular cell. In this case, it runs up to the total number of cells at which we achieved the defined compression percentage.
n - Number of points in each cell.
Cell.ID - Cell IDs are generated for the multivariate data using the 1D Sammon’s projection algorithm.
Quant.Error - Quantization error for each cell.
All the columns after this contain the centroid of each cell. Taken together they can also be called a codebook, which is the collection of all centroids (codewords).
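To make the columns above concrete, here is a minimal sketch that builds a table of the same shape from toy data. It is not the package's implementation: base-R `kmeans` stands in for the quantization step, the quantization error is assumed to be the mean Euclidean distance from the points of a cell to its centroid, and the names `toy`, `km` and `toy_summary` are hypothetical. Cell.ID is left out because it depends on the 1D Sammon's projection.

```r
# Illustrative only: base-R k-means stands in for the package's quantization step.
set.seed(42)
toy <- scale(matrix(rnorm(200 * 3), ncol = 3,
                    dimnames = list(NULL, c("f1", "f2", "f3"))))
km <- stats::kmeans(toy, centers = 5, nstart = 10)

# Euclidean distance from every point to the centroid of its own cell
dist_to_centroid <- sqrt(rowSums((toy - km$centers[km$cluster, ])^2))

toy_summary <- data.frame(
  Segment.Level  = 1,
  Segment.Parent = 1,
  Segment.Child  = seq_len(nrow(km$centers)),
  n              = as.vector(table(km$cluster)),
  # Assumed definition: mean distance of a cell's points to its centroid
  Quant.Error    = as.vector(tapply(dist_to_centroid, km$cluster, mean)),
  km$centers     # the codebook: one centroid (codeword) per row
)
toy_summary
```

Each row describes one cell, and the trailing `f1`, `f2`, `f3` columns play the role of the centroid columns in the summary table.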
For more detailed information on Data Projection please refer to section 3 of this vignette.
Let’s view the projected 2D centroids obtained by applying Sammon’s projection to the compressed data received from vector quantization. For the sake of brevity, only the first six rows are displayed.
# Level-1 cell objects; each element stores its projected centroid in $pt
hvt_coordinates <- hvt.results[[2]][[1]][["1"]]
coordinates_value <- lapply(seq_along(hvt_coordinates), function(x) {
  hvt_coordinates[[x]]$pt
})
# Stack the centroids into one data frame with a row index
centroid_coordinates <- do.call(rbind.data.frame, coordinates_value)
colnames(centroid_coordinates) <- c("x_coord", "y_coord")
centroid_coordinates$Row.No <- as.numeric(row.names(centroid_coordinates))
centroid_coordinates <- centroid_coordinates %>% dplyr::select(Row.No, x_coord, y_coord)
centroid_coordinates <- centroid_coordinates %>% data.frame() %>% round(4)
Table(head(centroid_coordinates))

| Row.No | x_coord | y_coord |
|---|---|---|
| 1 | -20.4512 | 11.0302 |
| 2 | -15.0311 | 2.3593 |
| 3 | -1.0211 | -8.5464 |
| 4 | -21.5262 | -0.3276 |
| 5 | -11.0535 | -5.1817 |
| 6 | -10.2188 | -11.7803 |
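For readers who want to see the projection step in isolation, the following sketch projects a toy codebook to 2D. It uses `MASS::sammon` as a readily available implementation of Sammon's mapping; whether the package calls it in exactly this way is an assumption, and `toy_centroids`, `proj` and `toy_coordinates` are hypothetical names.

```r
library(MASS)  # provides sammon()

set.seed(1)
# Hypothetical codebook: 30 centroids in 6 dimensions
toy_centroids <- matrix(rnorm(30 * 6), ncol = 6)

# Pairwise distances in the original space
d <- dist(toy_centroids)

# 2D configuration that tries to preserve those distances
proj <- MASS::sammon(d, k = 2, trace = FALSE)

toy_coordinates <- data.frame(
  Row.No  = seq_len(nrow(proj$points)),
  x_coord = round(proj$points[, 1], 4),
  y_coord = round(proj$points[, 2], 4)
)
head(toy_coordinates)
proj$stress  # closer to 0 means the distances are better preserved
```

The resulting data frame has the same layout as the centroid coordinates table, one projected point per cell.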
Let’s plot the projected 2D Sammon’s coordinates of the centroids.
# centroid_coordinates is the data frame built above, with columns x_coord and y_coord
ggplot(centroid_coordinates, aes(x_coord, y_coord)) +
  geom_point(color = "blue") +
  labs(x = "X", y = "Y")
Figure 14: Sammon’s 2D plot for 440 cells
For more detailed information on Voronoi tessellation, please refer to section 4 of this vignette.
Now that we have obtained the centroid coordinates from Sammon’s projection, let’s plot the Voronoi tessellation using the plotHVT function for better visualization.
# Voronoi tessellation plot for level one
muHVT::plotHVT(hvt.results,
line.width = c(0.2),
color.vec = c("#141B41"),
centroid.size = 0.01, #1.5
maxDepth = 1)
Figure 15: The Voronoi Tessellation for layer 1 shown for the 440 cells in the dataset ’computers’
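The defining property of a Voronoi tessellation is that every location in the plane belongs to the cell of its nearest seed, here the projected centroid. The sketch below illustrates that property on a grid using base R only. It does not reproduce how plotHVT draws cell boundaries, and `seeds`, `grid` and `cell` are hypothetical names.

```r
set.seed(7)
# Hypothetical 2D centroid coordinates (the seeds of the tessellation)
seeds <- cbind(x = runif(8, -20, 20), y = runif(8, -12, 12))

# Regular grid of query locations covering the plane
grid <- as.matrix(expand.grid(x = seq(-20, 20, by = 1),
                              y = seq(-12, 12, by = 1)))

# Squared Euclidean distance from every grid point to every seed
d2 <- outer(grid[, "x"], seeds[, "x"], "-")^2 +
      outer(grid[, "y"], seeds[, "y"], "-")^2

# A location belongs to the Voronoi cell of its nearest seed
cell <- max.col(-d2, ties.method = "first")

# Colour each grid point by its cell and overlay the seeds
plot(grid, col = cell, pch = 15, cex = 0.8, main = "Nearest-seed regions")
points(seeds, pch = 19)
```

The coloured blocks approximate the polygons that a true tessellation would draw with exact boundaries.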
Heat Maps
Now let’s plot the Voronoi Tessellation with the heatmap overlaid for all the features in the computers dataset for better visualization.
The heatmaps displayed below provide a visual representation of the spatial characteristics of the computers data, allowing us to observe patterns and trends in the distribution of each feature (n, price, speed, hd, ram, screen, ads). In each heatmap, the green shades highlight regions with higher values, while the indigo shades indicate regions with the lowest values. By analyzing these heatmaps, we can gain insights into the variations and relationships between these features within the computers data.
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "n",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 16: The Voronoi Tessellation with the heat map overlaid over the No. of entities in each cell in the ’computers’ dataset
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "price",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 17: The Voronoi Tessellation with the heat map overlaid over the variable price in the ’computers’ dataset
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "hd",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 18: The Voronoi Tessellation with the heat map overlaid over the variable hd in the ’computers’ dataset
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "ram",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 19: The Voronoi Tessellation with the heat map overlaid over the variable ram in the ’computers’ dataset
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "screen",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 20: The Voronoi Tessellation with the heat map overlaid over the variable screen in the ’computers’ dataset
muHVT::hvtHmap(
hvt.results,
trainComputers,
child.level = 1,
hmap.cols = "ads",
line.width = c(0.2),
color.vec = c("#141B41"),
palette.color = 6,
centroid.size = 0.01,
show.points = TRUE,
quant.error.hmap = 0.2,
n_cells.hmap = 15
)
Figure 21: The Voronoi Tessellation with the heat map overlaid over the variable ads in the ’computers’ dataset
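The heatmaps colour each cell by the value of one feature, from indigo for the lowest values to green for the highest. As a rough sketch of how such a mapping can work, the code below rescales a few per-cell values and picks a colour for each from a two-colour ramp. The hex codes and the names `cell_values`, `ramp` and `cell_colours` are assumptions for illustration, not the palette that hvtHmap uses.

```r
# Hypothetical per-cell values of one feature (for example, a price centroid)
cell_values <- c(-1.32, -0.45, 0.05, 0.11, 2.10)

# Assumed two-colour ramp from indigo (low) to green (high)
ramp <- grDevices::colorRampPalette(c("#4B0082", "#1B9E77"))(100)

# Rescale the values to [0, 1], then pick the matching step on the ramp
scaled <- (cell_values - min(cell_values)) / diff(range(cell_values))
cell_colours <- ramp[1 + floor(scaled * (length(ramp) - 1))]

data.frame(cell_values, cell_colours)
```

The cell with the lowest value receives the first (indigo) step of the ramp and the cell with the highest value receives the last (green) step.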
For more detailed information on prediction please refer to section 5 of this vignette.
Raw Testing Dataset
Now let’s look at the randomly selected raw testing dataset (1252 data points) before we pass it to the predictHVT function for scoring. For the sake of brevity, only the first six rows are displayed.
Table(head(testComputers_data))

| Row.No | price | speed | hd | ram | screen | ads |
|---|---|---|---|---|---|---|
| 3 | 1595 | 25 | 170 | 4 | 15 | 94 |
| 4 | 1849 | 25 | 170 | 8 | 14 | 94 |
| 7 | 1720 | 25 | 170 | 4 | 14 | 94 |
| 10 | 2575 | 50 | 210 | 4 | 15 | 94 |
| 11 | 2195 | 33 | 170 | 8 | 15 | 94 |
| 14 | 2295 | 25 | 245 | 8 | 14 | 94 |
Now that we have built the model, let us use the test dataset to predict which cell and which level each point belongs to.
predictHVT(data,
hvt.results,
hmap.cols = NULL,
child.level = 1,
...)
The important parameters of the predictHVT function are as follows:
data - A dataframe containing the test dataset. The dataframe should have all the variables (features) used for training. The variables from this dataset can also be overlaid as a heatmap.
hvt.results - The list of results obtained from the HVT function while performing hierarchical vector quantization on the training data. The list contains detailed information about the hierarchical vector quantized data, along with a summary section containing the number of points, the quantization error and the centroid of each cell, as per the n_cells given to the HVT() function.
hmap.cols - The column number or column name from the dataset indicating the variable for which the heatmap is to be plotted. A heatmap won’t be plotted if NULL is passed (Default = NULL).
child.level - A number indicating the level for which the heatmap is to be plotted (only used if hmap.cols is not NULL). Each level represents a different level of clustering or partitioning of the data.
normalize - A logical value indicating whether the columns in your dataset should be normalized, that is, scaled so that each variable has a mean of 0 and a standard deviation of 1. Default value is TRUE.
distance_metric - The type of distance measurement used to calculate similarity or dissimilarity between data points. It can be set to “Euclidean” (default) for straight-line distance or “Manhattan” for the sum of absolute differences between coordinates.
error_metric - The error metric used for evaluating the performance of the model. It can be “mean” or “max”; “mean” is selected by default.
yVar - Name of the dependent variable(s).
... - color.vec and line.width can be passed from here.
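The roles of normalize and distance_metric can be illustrated with a small scoring sketch in base R. It is not the package's code: `kmeans` stands in for HVT(), and `score_points`, `scale_by`, `train_mean`, `train_sd` and `codebook` are hypothetical names. The key assumption is that new data is standardized with the mean and standard deviation of the training data and then assigned to the nearest centroid.

```r
set.seed(240)
# Hypothetical training data and codebook; k-means stands in for HVT()
train <- matrix(rnorm(300 * 2, mean = 10, sd = 3), ncol = 2,
                dimnames = list(NULL, c("a", "b")))
train_mean <- colMeans(train)
train_sd   <- apply(train, 2, sd)
codebook   <- stats::kmeans(scale(train), centers = 6, nstart = 10)$centers

score_points <- function(new_data, codebook, center, scale_by,
                         distance_metric = c("Euclidean", "Manhattan")) {
  distance_metric <- match.arg(distance_metric)
  # Normalization reuses the training mean and sd, never those of the new data
  z <- scale(new_data, center = center, scale = scale_by)
  # Distance from every new point to every centroid
  d <- sapply(seq_len(nrow(codebook)), function(j) {
    delta <- sweep(z, 2, codebook[j, ])
    if (distance_metric == "Euclidean") sqrt(rowSums(delta^2)) else rowSums(abs(delta))
  })
  d <- matrix(d, nrow = nrow(z))
  # Each point is assigned to the cell of its nearest centroid
  data.frame(Cell = max.col(-d, ties.method = "first"),
             Distance = apply(d, 1, min))
}

test <- matrix(rnorm(5 * 2, mean = 10, sd = 3), ncol = 2,
               dimnames = list(NULL, c("a", "b")))
score_points(test, codebook, train_mean, train_sd, "Manhattan")
```

Switching the last argument to "Euclidean" changes only how the distance to each centroid is measured, which can change the assigned cell for points that lie between two centroids.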
set.seed(240)
predictions <- muHVT::predictHVT(
testComputers,
hvt.results,
child.level = 1,
line.width = c(1.2),
color.vec = c("#141B41"),
quant.error.hmap = 0.2,
n_cells.hmap = 440,
normalize = TRUE
)
Let’s see which cell and level each point belongs to and check the mean absolute difference. For the sake of brevity, we will only show the first 10 rows.
summary_list <- hvt.results[[3]]
train_colnames <- names(summary_list[["nodes.clust"]][[1]][[1]])
scaled_test_data <- scale(
testComputers[, train_colnames],
center = summary_list$scale_summary$mean_data[train_colnames],
scale = summary_list$scale_summary$std_data[train_colnames])
testComputers <- scaled_test_data
data1 <- data.frame(testComputers)
data1$Row.No <- row.names(testComputers)
data1 <- data1 %>% dplyr::select(Row.No,price,speed,hd,ram,screen,ads)
colnames(data1) <- c("Row.No","price_act","speed_act","hd_act","ram_act","screen_act","ads_act")
data2 <- predictions[["scoredPredictedData"]]
data2 <- data2 %>% dplyr::select(Cell.ID,price,speed,hd,ram,screen,ads)
colnames(data2) <- c("Cell.ID","price_pred","speed_pred","hd_pred","ram_pred","screen_pred","ads_pred")
combined <- cbind(data1,data2)
combined$diff <- rowMeans(abs(combined[, c("price_act","speed_act","hd_act","ram_act","screen_act","ads_act")] - combined[, c("price_pred","speed_pred","hd_pred","ram_pred","screen_pred","ads_pred")]))
rownames(combined) <- NULL
options(scipen = 999)
combined %>% head(100) %>%
as.data.frame() %>%
Table(scroll = TRUE, limit = 10)

| Row.No | price_act | speed_act | hd_act | ram_act | screen_act | ads_act | Cell.ID | price_pred | speed_pred | hd_pred | ram_pred | screen_pred | ads_pred | diff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3 | -1.0785679 | -1.2710382 | -0.9472574 | -0.7589986 | 0.4307274 | -1.721306 | 24 | -1.0786 | -1.2710 | -0.9473 | -0.7590 | 0.4307 | -1.7213 | 0.0000246 |
| 4 | -0.6386691 | -1.2710382 | -0.9472574 | -0.0501185 | -0.6741149 | -1.721306 | 143 | -0.6387 | -1.2710 | -0.9473 | -0.0501 | -0.6741 | -1.7213 | 0.0000252 |
| 7 | -0.8620823 | -1.2710382 | -0.9472574 | -0.7589986 | -0.6741149 | -1.721306 | 29 | -0.8621 | -1.2710 | -0.9473 | -0.7590 | -0.6741 | -1.7213 | 0.0000201 |
| 10 | 0.6186793 | -0.0817113 | -0.7914378 | -0.7589986 | 0.4307274 | -1.721306 | 237 | 0.6187 | -0.0817 | -0.7914 | -0.7590 | 0.4307 | -1.7213 | 0.0000174 |
| 11 | -0.0394370 | -0.8904536 | -0.9472574 | -0.0501185 | 0.4307274 | -1.721306 | 223 | -0.0394 | -0.8905 | -0.9473 | -0.0501 | 0.4307 | -1.7213 | 0.0000296 |
| 14 | 0.1337515 | -1.2710382 | -0.6550957 | -0.0501185 | -0.6741149 | -1.721306 | 162 | 0.1338 | -1.2710 | -0.6551 | -0.0501 | -0.6741 | -1.7213 | 0.0000217 |
| 15 | 0.8334330 | -0.0817113 | -0.7836469 | -0.0501185 | -0.6741149 | -1.721306 | 215 | 0.8334 | -0.0817 | -0.7836 | -0.0501 | -0.6741 | -1.7213 | 0.0000218 |
| 19 | -0.2126255 | -0.8904536 | -0.6356183 | -0.7589986 | 0.4307274 | -1.721306 | 165 | -0.2126 | -0.8905 | -0.6356 | -0.7590 | 0.4307 | -1.7213 | 0.0000208 |
| 22 | 0.9996940 | 0.6794579 | -1.1030770 | -0.7589986 | -0.6741149 | -1.721306 | 214 | 0.9997 | 0.6795 | -1.1031 | -0.7590 | -0.6741 | -1.7213 | 0.0000156 |
| 24 | 1.1382448 | -0.0817113 | -0.7914378 | -0.7589986 | 2.6404120 | -1.721306 | 379 | 1.1382 | -0.0817 | -0.7914 | -0.7590 | 2.6404 | -1.7213 | 0.0000189 |
hist(combined$diff, breaks = 20, col = "blue", main = "Mean Absolute Difference", xlab = "Difference")
Figure 22: Mean Absolute Difference
We can see the predictions for the points in the table above. The centroid of the cell that the point is mapped to is the codeword (predictor) for that cell.
Example I: muHVT with the Torus dataset
We used the torus dataset for multidimensional data visualization with Sammon’s projection.
We randomly selected 9000 data points for training and the remaining data points for validation.
Our goal is to achieve data compression of at least 80%.
We constructed a compressed HVT map (hvt.torus) by applying HVT() to the torus dataset with n_cells = 100, quant.error = 0.1, and depth = 1. The compression summary showed that none of the 100 cells had reached the quantization error threshold.
We created another compressed HVT map (hvt.torus2) by applying HVT() to the torus dataset, this time with n_cells = 300, quant.error = 0.1, and depth = 1. The compression summary showed that 2% of the cells had reached the quantization error threshold.
Finally, we generated a third compressed HVT map (hvt.torus3) by applying HVT() to the torus dataset with n_cells = 900, quant.error = 0.1, and depth = 1. The compression summary showed that 85% of the 900 cells had reached the quantization error threshold, and we can clearly visualize the 3D torus (donut) in 2D space.
Example II: muHVT with the Personal Computer dataset
We used the computers dataset to generate predictions and see which cell and level each point belongs to.
We randomly selected 80% of the data points for training and the remaining 20% for validation.
Our goal is to achieve data compression of at least 80%.
We constructed a compressed HVT map by applying HVT() to the training dataset with n_cells set to 440 and quant.error set to 0.2, and we attained a compression of 81%.
We then plotted the Voronoi tessellation with the heatmap overlaid for the features in the computers dataset for better visualization.
Next, we passed the validation dataset, along with the HVT map obtained from HVT(), to predictHVT() to see which cell and level each point belongs to.
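The compression figures quoted in these summaries can be read as the share of cells whose quantization error has reached the threshold. The sketch below works this out for five Quant.Error values taken from the computers summary table (Segment.Child 196 to 200), under the assumption that a cell counts as compressed when its error is at or below quant.error. The names `quant_error`, `threshold` and `compression_pct` are hypothetical.

```r
# Quant.Error of cells 196 to 200 in the computers summary table
quant_error <- c(0.21, 0.20, 0.13, 0.23, 0.10)
threshold   <- 0.2   # the quant.error value used for the computers map

# Assumption: a cell counts as compressed when its error is at or below the threshold
reached <- quant_error <= threshold

# Share of cells that reached the threshold, as a percentage
compression_pct <- 100 * mean(reached)
compression_pct
```

Three of the five cells are at or below 0.2, so this small sample gives 60%. Applied to all cells of a map, the same calculation gives a figure comparable to the 80% goal.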
Pricing Segmentation - The package can be used to discover groups of similar customers based on their spending patterns and to understand the price sensitivity of customers.
Market Segmentation - The package can be helpful in market segmentation, where we have to identify micro and macro segments. The method used in this package can do both kinds of segmentation in one go.
Anomaly Detection - This method can help us categorize system behavior over time and find anomalies when there are changes in the system, for example, finding fraudulent claims in healthcare insurance.
The package can help us understand the underlying structure of the data. Suppose we want to analyze a curved surface such as a sphere or a vase; we can approximate it with many small low-order polygons in the form of tessellations using this package.
In biology, Voronoi diagrams are used to model a number of different biological structures, including cells and bone microarchitecture.
Using the base idea of Systems Dynamics, these diagrams can also be used to depict customer state changes over a period of time.
Topology Preserving Maps : https://users.ics.aalto.fi/jhollmen/dippa/node9.html#:~:text=The%20property%20of%20topology%20preserving,tool%20of%20high%2Ddimensional%20data
Vector Quantization : https://en.wikipedia.org/wiki/Vector_quantization
Sammon’s Projection : http://en.wikipedia.org/wiki/Sammon_mapping
Voronoi Tessellations : http://en.wikipedia.org/wiki/Centroidal_Voronoi_tessellation
Embedding : https://en.wikipedia.org/wiki/Embedding